1.0  SUMMARY

Rapid,  accurate  detection  of  biological  and  chemical  threats  is  a  critical  need  of  today’s 
military.  The  key  component  to  such  detection  is  identification  of  recognition  elements 
specific  to  the  target  of  interest,  which  will  later  be  integrated  onto  a  sensor  platform. 

Aptamers  have  been  successfully  used  as  recognition  elements  for  targets  ranging  from  ions  to 
whole  cells.  However,  aptamer-based  sensor  design  has  two  main  challenges:  1)  Aptamers  are 
selected  in  solution  and,  as  such,  their  sequences  are  not  optimized  to  function  when 
immobilized  on  a  surface;  2)  The  initial  aptamer  selection  process  is  time  consuming,  which 
slows  sensor  development. 

Alternatively,  oligonucleotide  microarrays  are  attractive  tools  for  aptamer  selection  and  sensor 
applications,  offering  several  advantages  over  solution-based  selection.  First,  selection  is 
performed  with  oligonucleotides  immobilized  on  a  solid  substrate,  thus  binding  is  not  affected 
by  proximity  to  the  sensor  surface,  and  more  stringent  washing  conditions  can  be  utilized  to 
increase  the  efficiency  of  the  system.  Another  advantage  is  that  arrays  are  commercially 
available  and  contain  up  to  one  million  unique  sequences  which  can  be  tested  in  a  single 
experiment,  greatly  reducing  the  time-frame  of  selection.  Microarrays  also  exclude  a 
requirement  for  multiple  round  cycling,  polymerase  chain  reaction  (PCR),  cloning,  and 
sequencing  necessary  in  traditional  aptamer  selection  methods. 

While  the  initial  library  is  composed  of  fewer  sequences  than  solution-based  methods,  in  silico 
screening  provides  an  optimization  step  on  the  starting  pool  to  increase  the  probability  of 
identifying  sequences  with  strong  ligand  binding.  Therefore,  we  will  improve  the  efficiency  of 
aptamer  selection  in  a  microarray  experimental  setting  using  oligonucleotide  libraries  enriched 
by  pre-screening  the  potential  landscape  of  binders  in  silico.  In  our  customized  patterning,  a 
library  of  oligonucleotides  was  generated  by  applying  sequence  constraints  based  on  various 
parameters  such  as  total  length  and  minimum/maximum  number  of  paired  bases,  thus  resulting 
in  an  increased  probability  of  target  binding.  Two  hundred  thousand  sequences  fitting  the 
constraints  were  synthesized  onto  custom  microarray  chips  in  replicate  in  order  to 
experimentally test the affinities of the oligonucleotides for specific targets. From this work,
we were able to identify potential protein-binding sequences and several possible small
molecule-binding sequences, and to develop a method for testing oligonucleotides for G-quartet
formation by microarray. Generally applicable findings include:

1.  Increasing the linker length increases fluorescence intensity by extending the
probes  from  the  surface,  and  some  minimal  length  is  needed  to  obtain  reliable 
data. 

2.  The  linker  identity  (A,  C,  G,  or  T)  affects  fluorescence  intensity,  with  T  base 
linkers  proving  reliable  across  all  linker  lengths  and  target  concentrations. 

3.  Direct (Cy3-protein) and indirect (biotin-protein followed by Cy3-streptavidin)
target labeling methods were compared, and the indirect method both enhanced S/N
by orders of magnitude and decreased nonspecific binding to controls.

4.  The  planar  and/or  aromatic  reporter  tags  interacted  extensively  with  the  DNA. 

5.  The  surface  density  of  the  arrays  is  too  low  to  expect  binding  from  a  random 
library, so starting library optimization must be performed.

6.  Purification of free dye from dye-protein conjugates is essential, and the
manufacturer’s method is insufficient for aptamer identification experiments.

The  ideal  course  of  action  for  those  wishing  to  pursue  microarray  technology  for  aptamer 
identification  is  to  entirely  eliminate  the  need  for  a  tag  to  be  conjugated  to  the  target. 
Otherwise, it is extremely difficult to determine whether lower-than-control level binding is
real target binding or nonspecific tag interaction. If a tag is necessary, biotin conjugation of
the target is preferable to Cy3 conjugation. Control experiments that identify sequences binding
to the tag alone can help to alleviate this concern as well. Future researchers should take into
consideration  how  to  design  the  initial  library,  because  this  is  likely  the  main  determinant  of 
the  success  of  the  experiment.  Studying  the  sequences  and  structures  of  existing  aptamers  to 
the  target  or  similar  targets  could  aid  in  this  pursuit,  but  does  not  necessarily  guarantee 
increased  success,  especially  considering  that  the  aptamer  need  is  centered  on  targets  that  do 
not  currently  have  aptamers  available. 

This  work  aims  at  rapid  identification  and  characterization  of  specific,  high-affinity  binders  for 
different  target  molecules  including  biomarkers  and  chemical  warfare  agents  for  future 
applications  in  biosensors,  beginning  with  proof-of-concept  studies  with  a  well-known  protein 
aptamer/target  pair.  The  work  presented  has  potential  to  create  technologies  that  can  expedite 
the  identification  of  aptamer  recognition  elements  for  targets  including  chemical  warfare 
agents,  explosives,  and  biomarkers.  In  turn,  these  technologies  will  enable  faster  integration 
into detection devices.

2.0  INTRODUCTION


2.1  Systematic  Evolution  of  Ligands  for  Exponential  Enrichment  (SELEX)  Aptamer 
Selection 

Aptamers are selected using a method known as SELEX1-3, depicted in Figure 1. The SELEX
procedure begins with up to 10^15 unique sequences from a chemically synthesized, randomized
oligonucleotide library competing for binding to the target. This library consists of sequences
designed with two PCR primer regions flanking a random region of 30-50 nucleotides. The
sequence space, or number of different base permutations, of this library is defined as 4^N,
where N is the length of the random region; for a 30-base random region, 4^30 (about 1 x 10^18)
base combinations are possible. This results in an under-represented library in which each
sequence in the initial 10^15 appears only once in the starting library pool, and a portion of
the sequence space is excluded. Sequences not binding the target are partitioned from binding
oligonucleotides, and binders are eluted from the target. This partitioning step (separating
the binding sequences from non-binders) is the main determinant of the efficiency of the
selection. The binding sequences are PCR-amplified and converted to
single-stranded  deoxyribonucleic  acid  (ssDNA)  for  the  next  round  of  selection.  Typically  after 
the  first  round,  researchers  may  institute  a  counter-selection  step  in  which  sequences  binding  to 
a  control  (such  as  support  matrix  or  a  molecule  similar  in  structure  to  the  target)  are  removed 
from  solution,  and  those  that  do  not  bind  to  the  control  are  retained  for  future  SELEX  rounds. 
The  process  is  repeated  in  cyclical  fashion  until  the  final  pool  is  enriched  for  sequences  binding 
to  the  target.  Following  enrichment,  the  oligonucleotide  pool  is  cloned/sequenced,  individual 
sequences  are  synthesized  to  test  for  target  binding,  and  successful  candidates  are  characterized 
in  terms  of  binding  affinity,  target  specificity,  etc.  Frequently,  the  aptamers  are  truncated  after 
the  preliminary  in  vitro  studies  to  remove  regions  which  do  not  contribute  to  binding.  This 
reduces  the  cost  of  aptamer  synthesis,  increases  the  yield,  and  may  actually  act  to  stabilize  the 
aptamer,  resulting  in  an  increased  affinity  for  the  target4.  Researchers  may  also  perform 
mutations  and  insertions/deletions  to  a  truncated  aptamer  to  determine  consensus  regions 
necessary  for  binding  or  identify  higher  affinity  aptamers5. 
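
The sequence-space argument above can be checked with a short calculation. The following is a minimal sketch rather than part of the original work: the function names are introduced here for illustration, and the library size of 10^15 is the figure quoted in the text.

```python
def sequence_space(random_region_length: int) -> int:
    """Number of distinct sequences for a random region of N bases (4^N)."""
    return 4 ** random_region_length


def fraction_sampled(library_size: float, random_region_length: int) -> float:
    """Upper bound on the fraction of sequence space a library can cover,
    assuming every molecule in the library is a unique sequence."""
    return min(1.0, library_size / sequence_space(random_region_length))


space_30 = sequence_space(30)           # 4^30 = 2^60, roughly 1.15 x 10^18
coverage = fraction_sampled(1e15, 30)   # share of the space present in a 10^15 pool

print(f"4^30 sequence space: {space_30:.3e}")
print(f"fraction covered by a 1e15 library: {coverage:.2e}")
```

Under these assumptions, a 10^15 library covers less than 0.1% of the 30-base sequence space, which is why each sequence is expected to appear only once and part of the space is never sampled.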

SELEX has been applied to targets ranging from ions and small molecules to proteins and
whole  cells.  Our  main  interests  are  to  select  aptamers  that  bind  to  biomarkers  of  physiological 
states  such  as  stress,  fatigue,  and  vigilance.  Many  of  these  biomarkers  are  small  molecule 
targets  and  thus  pose  a  challenge  for  aptamer  selection  and  subsequent  biosensor  integration. 
Aside  from  the  abundance  of  biomarker  options,  one  downstream  application  of  a  selected 
aptamer  would  be  integration  into  a  riboswitch  construct  which  is  found  in  bacteria  and  other 
microorganisms.  Riboswitches  are  able  to  regulate  a  selected  activity  by  changes  induced  by 
binding  a  small  molecule  target.  (Although  riboswitches  are  ribonucleic  acid  (RNA)  level 
constructs,  once  the  described  methods  are  successful  with  DNA,  they  can  later  be  transitioned 
to  RNA  work  required  specifically  for  this  application.)  However,  SELEX  on  small  molecule 
targets  is  more  challenging  than  standard  protein  targets  because  of  the  difficulty  of  applying 
traditional  partitioning  methods.  Most  methods  rely  on  a  large  difference  in  molecular  weight 
or  mobility  characteristics  between  the  bound  aptamer/target  complex  and  free  non-binding 
sequences6.  For  example,  nitrocellulose  membranes  employ  electrostatic  interactions  between 
the negatively charged polymer and positively charged proteins to separate binders from
non-binders. When a mixture of the target and DNA is passed through the filter, the
oligonucleotides  associated  with  the  protein  will  be  retained  by  the  membrane  while  the  free 
sequences  will  be  rejected  due  to  their  negative  charges.  Similarly,  gel  electrophoresis  and 
capillary  electrophoresis  separate  samples  based  on  a  change  in  electrophoretic  mobility  where 
the  difference  may  not  be  enough  for  separation  with  small  molecule  targets  depending  on  the 
properties  of  the  system.  Immobilizing  the  small  molecule  can  be  chemically  difficult  and  can 
change the binding properties such that binding is reduced or destroyed in free solution.
Therefore, there are fewer small molecule aptamers reported in the literature, and the need to
rapidly develop aptamers to targets relevant to biosensor work is apparent.

Distribution A: Approved for public release; distribution is unlimited.
88ABW-2014-0458; Cleared 11 February 2014

While  SELEX  has  been  tremendously  successful  in  attaining  high  affinity  aptamers  for  a 
variety  of  targets,  the  time  and  labor  investments  make  high-throughput  applications 
prohibitive.  For  example,  the  number  of  cycles  required  varies  depending  on  the  conditions 
used in the selection; however, a typical selection requires an average of 12 cycles and a
timeline of two to six months, without including initial optimization processes or validation of
aptamer candidates6,7. This is mainly due to the inherently low partitioning efficiency of
SELEX: only 1 in 1 x 10^9 to 1 x 10^13 sequences are binders6, but 10^12 nonspecific binders
will be selected in the first round of the selection due to limitations of the separation method
and an inability to differentiate the few high affinity binders from the multitude of nonspecific
binders with affinities several orders of magnitude smaller than specific interactions8.
Therefore, the number of nonspecific binders selected is initially similar to or in some cases
larger than the number of specific binders. This is why several rounds of SELEX are required
to  subtract  the  nonspecific  “noise”  from  the  specific  binders  through  increased  stringency  of 
conditions.  Additionally,  the  stringency  is  often  intentionally  very  low  in  the  first  round  so  the 
rare  binding  sequences  are  not  lost  during  the  partitioning  process.  As  described  above,  each 
sequence  in  the  initial  library  is  present  as  an  individual  in  the  pool,  so  loss  of  a  binding 
sequence  in  the  first  round  will  remove  it  from  contention  as  a  binding  candidate  at  the  end  of 
the  selection. 

A  further  issue  with  SELEX  lies  in  the  bias  introduced  into  the  selection  by  PCR  in  each  round 
of  selection.  PCR  has  been  reported  to  amplify  different  oligonucleotides  unequally,  resulting 
in  an  inaccuracy  of  comparative  representation  within  a  pool  as  the  selection  progresses9.  The 
stability  of  structures  formed  by  aptamers  binding  the  target  may  also  decrease  the  efficiency 
by  which  the  tightest  binders  are  amplified,  and  the  polymerase  introduces  mutations  into  the 
sequences  as  well.  A  second  form  of  bias  is  introduced  by  the  cloning  and  Sanger  sequencing 
method  typical  for  aptamer  identification.  The  pool  is  cloned  into  an  appropriate  vector  (which 
has  a  unique  set  of  factors  affecting  cloning  efficiency)  and  transformed  into  competent 
bacterial  cells  to  spatially  deconvolute  the  sequences  into  individual  colonies  representing  one 
sequence  each,  then  typically  ~100  colonies  are  randomly  chosen  from  thousands  of  viable 
colonies  for  sequencing10.  We  will  use  the  results  of  a  cortisol  selection  (previous  work 
performed  in  this  lab)  as  an  example  to  illustrate  the  concern  with  covering  such  a  small 
portion of the sequence space. At the end of selection, the final cortisol pool contained
1.37 x 10^-11 moles (8.25 x 10^12 molecules) of DNA. Sequencing ~100 colonies under this cloning
protocol would represent only 1 in 8.25 x 10^10 total sequences for the cortisol selection. It
would likely reflect the most abundant sequence present, but may not report those that have
artificially lower numbers due to factors such as PCR bias or cloning efficiency. Cho et al. have
shown that removing the cloning bias and using next-generation sequencing methods that sample a
larger percentage of the pool (10^5-10^7 sequences) produced aptamers with affinities 3-8-fold
higher than a similar selection performed with cloning and Sanger sequencing10. Thus, reducing
or eliminating the
reliance  on  PCR  and  cloning  would  give  a  higher  probability  of  identifying  the  true  highest 
affinity  sequences. 
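
The sampling depth in the cortisol example can be reproduced with the arithmetic below. This is an illustrative sketch only: it assumes a final pool of 1.37 x 10^-11 moles, which is the value consistent with the 1 in 8.25 x 10^10 ratio quoted above, and ~100 sequenced colonies; the variable names are introduced here.

```python
AVOGADRO = 6.022e23  # molecules per mole

pool_moles = 1.37e-11        # assumed final cortisol pool size (see lead-in)
colonies_sequenced = 100     # typical number of colonies picked for Sanger sequencing

# Convert the pool from moles to individual DNA molecules.
pool_molecules = pool_moles * AVOGADRO

# Each sequenced colony stands in for this many molecules of the pool.
molecules_per_colony = pool_molecules / colonies_sequenced

# For comparison, the same ratio at next-generation sequencing depths of 1e5 to 1e7 reads.
molecules_per_read = {reads: pool_molecules / reads for reads in (1e5, 1e7)}

print(f"pool size: {pool_molecules:.2e} molecules")
print(f"cloning samples 1 in {molecules_per_colony:.2e} molecules")
for reads, ratio in molecules_per_read.items():
    print(f"{reads:.0e} reads sample 1 in {ratio:.2e} molecules")
```

Even at 10^7 reads only a small fraction of the pool is observed, but the sampled fraction is several orders of magnitude larger than with ~100 colonies.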

2.2  Microarray-Based Aptamer Identification


We  propose  to  use  microarrays  as  a  faster,  more  efficient  method  of  performing  aptamer 
identification  and  characterization  studies.  DNA  microarrays  function  by  identifying  locations 
of  fluorescence  (from  fluorescently  labeled  target)  and  correlating  them  with  the  position  of 
known  sequences  covalently  synthesized  on  the  array  (Fig  2).  A  fluorescently  labeled  target  is 
added  and  an  image  of  the  array  is  captured  using  a  microarray  scanner  (Agilent  High- 
resolution  Microarray  Scanner  currently  on  loan  from  Agilent  Technologies).  Alternatively, 
the  target  can  be  biotinylated  and  allowed  to  interact  with  the  array,  followed  by  a  short  Cy3- 
streptavidin  incubation.  Our  initial  studies  were  performed  with  protein  targets,  but  we  are 
investigating  options  to  transition  the  technology  to  small  molecule  biomarkers. 


[Figure 2 panels: Direct Labeling Method; Indirect Labeling Method, with dye-labeled
streptavidin (short incubation); array of 10^6 sequences; example readout below]

Location   Sequence        Fluorescence
A1         AATGACTCG...    28
A2         CTAGAGTCA...    6
A3         ACGGGTACT...    16456
A4         GGTACGATA...    137
A5         CAGAAAATC...    5361
A6         TTTACTATAG...   10679

Figure 2: Methods of Microarray Aptamer Identification
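
The readout illustrated in Figure 2 pairs each array location with a known sequence and a fluorescence value, so candidate binders can be ranked directly. The sketch below uses the example values shown in the figure; the function name and the choice to keep the top three features are assumptions made for illustration, not the analysis procedure used in this work.

```python
# Example readout transcribed from Figure 2: (location, sequence prefix, fluorescence).
readout = [
    ("A1", "AATGACTCG", 28),
    ("A2", "CTAGAGTCA", 6),
    ("A3", "ACGGGTACT", 16456),
    ("A4", "GGTACGATA", 137),
    ("A5", "CAGAAAATC", 5361),
    ("A6", "TTTACTATAG", 10679),
]


def rank_by_fluorescence(features):
    """Return features sorted from brightest to dimmest."""
    return sorted(features, key=lambda feature: feature[2], reverse=True)


ranked = rank_by_fluorescence(readout)
top_locations = [location for location, _, _ in ranked[:3]]  # hypothetical cutoff

for location, sequence, intensity in ranked:
    print(f"{location}  {sequence:<12} {intensity}")
```

Because every sequence sits at a known location, no PCR, cloning, or sequencing step is needed to recover the identity of the brightest features.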

The  main  advantage  of  using  microarrays  for  this  purpose  is  the  time  saved  by  the  ability  to 
complete  studies  of  many  truncates  and  mutations  in  parallel  rather  than  individually.  One 
microarray  experiment  can  be  completed  in  less  than  a  day,  with  data  analysis  requiring 
another  1-2  days.  This  characteristic  is  a  function  of  the  improved  partitioning  efficiency 
available  by  covalently  linking  the  sequences  to  the  surface.  Higher  stringency  conditions  can 
be  applied  to  identify  sequences  with  better  binding  properties.  Also,  no  PCR,  cloning,  or 
sequencing  is  necessary  due  to  design  of  the  microarray  with  sequences  in  known  locations. 
The  arrays  are  fully  customizable  so  the  user  can  define  the  exact  sequences  of  interest 
produced on the arrays. Cost savings are also significant in that one array at a price of ~$600
(Agilent 1X1M array) can test up to a million sequences simultaneously, as opposed to ~$135
for one HPLC-purified sequence from Integrated DNA Technologies (IDT) for traditional
analysis.

Prior  to  today’s  array  technology,  some  of  the  issues  related  to  the  use  of  microarrays  in 
aptamer  studies  were  associated  with  the  techniques  used  to  deposit  the  oligonucleotides11. 
First, the cost of synthesizing thousands of sequences for experiments was prohibitive.
Second, spot size reproducibility, specifically related to the conditions used for spotting the
arrays (humidity, buffer used, etc.), was variable. To avoid these issues, recent developments
in  microarray  technologies  provide  the  option  of  synthesizing  the  oligonucleotides  from  the 
surface  of  the  slides12.  This  significantly  reduces  the  cost  of  manufacturing  the  chip  and  offers 
exquisite  control  over  the  oligonucleotide  synthesis.  Several  commercial  companies 
(MYcroarray,  Agilent,  Nimblegen)  have  essentially  eliminated  these  issues  and  offer  fully 
customizable  array  product  options  up  to  80  bases  long.  However,  this  technology  has  been 
mostly  developed  for  gene  expression  studies  that  function  by  direct  hybridization  of  the 
fluorescent-tagged  genomic  fragments  to  cDNA  on  the  microarray.  Therefore,  protocols  and 
methods  for  aptamers  must  be  developed  and  optimized  by  the  individual  researcher  with 
challenges  specific  to  this  application. 

Significant  work  has  been  performed  by  different  groups  in  the  use  of  aptamers  for  target 
detection  in  custom-made  microarray  settings.  Many  parameters  will  affect  the  aptamer 
response  that  are  not  as  significant  in  cDNA  hybridizations,  including:  probe  density, 
proximity  of  the  binding  site  to  the  surface,  immobilization  orientation  (5'  or  3'  end)  and  even 
the  buffer  used  in  the  studies13,14.  It  has  been  confirmed  that  known  DNA  aptamers  can  be 
immobilized  on  a  microarray  and  conditions  to  preserve  their  activity  can  be  found11,  and  the 
conditions  used  for  these  studies  should  mimic  the  conditions  used  during  the  initial  solution- 
based  selection  process15.  Additionally,  the  response  of  an  aptamer  in  solution  can  be 
significantly  different  than  when  it  is  attached  to  an  array  surface16.  Several  studies  report 
differences  in  aptamer  affinity  when  comparing  microarray  and  solution-based  dissociation 
constant  measurements14,17.  In  most  cases  the  affinities  of  these  aptamers  were  significantly 
lower  on  the  microarray  than  when  tested  in  the  solution-based  conditions  used  during  their 
selection15. Thus, applying a microarray-modified aptamer to riboswitch conditions would likely
improve the aptamer function relative to the array, and when applied to more traditional
biosensor platforms, it would ensure binding activity is retained.

In  the  initial  SELEX  introduction,  we  touched  on  the  fact  that  post-SELEX  modifications  are 
often  performed  to  truncate  then  introduce  mutations  to  improve  function  or  study  some  of  the 
structural  properties  of  aptamer/target  binding.  These  studies  give  rise  to  a  number  of 
advantages,  including  a  reduced  cost  of  synthesis,  improved  yield,  and  potential  improvement 
in  the  affinity  of  the  aptamer  for  the  target.  Standard  methods  of  aptamer  truncation  include 
radioactive  labeling  and  fragmentation,  followed  by  synthesis  and  purification  of  active 
sequences5.  A  less  labor  intensive  method  involves  analysis  of  the  individual  sequences  of  a 
homologous  family  obtained  after  pool  sequencing  and  alignment.  However,  this  requires  the 
synthesis,  purification,  and  affinity  characterization  of  many  sequences  in  a  family,  also  a  time- 
consuming  process.  Similarly,  study  of  mutations  requires  a  second  complete  SELEX  to 
identify higher affinity aptamers. Biesecker et al. (1999) selected aptamers to human
complement C5 with dissociation constants of 20-40 nM after 12 rounds18. They then used the
predicted secondary structure of the chosen best aptamer as a template to perform a second
8-round “biased” SELEX to improve the dissociation constant of the highest affinity sequences
to 2-5 nM with full functional activity.

Fischer et al. have shown that microarrays are capable of exploring truncations of the
well-studied immunoglobulin E (IgE) aptamer19. They were able to confirm the importance of a
21-base consensus sequence and document the minimum truncated sequence demonstrating
fluorescence intensity similar to the original structure. They found that the consensus sequence
was generally confined to the loop in a stem-loop structure, and the stem was more amenable to
manipulation than the consensus loop. Katilius et al. introduced a series of single, double, and
triple mutations to the IgE aptamer by microarray17. They were able to improve binding
relative to the initial aptamer by manipulating the bases in the stem region, and showed that
single base mutations can completely destroy target binding. Collectively, these two studies
concluded that extending the aptamers from the array surface enhances target binding, the
fluorescence intensity is proportional to the concentration of target added, and higher relative
fluorescence values between probes on the same array indicate higher affinity.

Three  main  works  in  the  literature  have  actually  used  arrays  to  perform  aptamer  selection 
rather than simply employing them for detection or structural examination20-22. The starting
libraries  consisted  of  10  -10  initial  sequences  and  evolved  aptamers  through  in  silico  genetic 
algorithms  in  multiple  chip  generations  (rounds).  Each  of  them  either  used  naturally 
fluorescent  targets,  utilized  a  target  with  well-studied  aptamer  binding  motifs,  or  took  into 
account  the  characteristics  of  known  aptamers  for  library  design. 

As  a  whole,  this  body  of  work  from  the  literature  makes  several  important  points  advocating 
the  proposed  applications:  1)  It  is  possible  to  detect  targets  using  arrays;  2)  Array  performance 
will  vary  with  the  conditions  used,  and  should  be  as  close  to  the  selection  conditions  as 
possible;  3)  The  immobilization  of  the  aptamers  in  an  array  setting  likely  will  not  negatively 
impact  sequences  applied  to  riboswitches  and  would  benefit  traditional  biosensor  detection 
platforms;  4)  Microarrays  can  be  used  for  truncation/mutation  studies  and  provide  valuable 
insights  into  aptamer  structure;  5)  Probes  can  be  evaluated  in  massively  parallel  format,  and 
their  relative  fluorescence  intensities  can  be  compared  to  indicate  higher  affinity  binders;  6) 
Aptamer  selection  is  possible  if  the  starting  library  is  “designed”  to  contain  potential  binders. 

Some  possible  technical  challenges  for  applying  this  method  to  aptamer  modifications  for 
newly  selected  targets,  where  the  behavior  of  the  sequence  is  not  well-documented  in  the 
literature,  include  nonspecific  dye  interactions  and  small  molecule  detection.  Most  dye 
systems used to label proteins for microarray work possess conjugated π-systems reported to
interact with DNA directly20. This would make it difficult to differentiate between sequences
that are binding the actual protein and those that instead interact with the dye. The Gold team
reported  using  photoactive  base  substitutions  to  covalently  cross-link  proteins  to  the  arrays, 
followed  by  stringent  washing  steps  to  remove  nonspecific  binders23.  However,  current 
commercial  microarray  manufacturers  do  not  offer  these  base  substitutions  in  their  array 
synthesis.  A  feasible  method  of  biotinylating  the  protein  followed  by  a  short  dye-labeled 
streptavidin  incubation  has  been  shown  to  demonstrate  less  nonspecific  binding  compared  to  a 
direct  dye  labeling20,24.  Also,  while  methods  exist  to  modify  proteins  with  biotin  or 
fluorescence  tags  largely  without  affecting  their  binding  properties,  doing  so  to  a  small 
molecule biomarker target may prove more difficult. These tags often alter the binding
properties of the target, and because their size is comparable to that of the target molecule,
they tend to dominate the interaction with DNA. The chemistry to covalently link the tag to the
target will vary depending on the target structure.

A major drawback is that the highest density arrays have a maximum of ~10^6 sequences. This
is in stark contrast to SELEX methods, which evolve from an initial library of ~10^15 sequences.
However, the success of a selection is more dependent on the number of binders present in the
initial pool than on the raw total number of sequences, and combinatorial drug-screening
libraries identify binders from 10^3-10^6 compounds25. Therefore, biasing a DNA or RNA starting
library with a maximum of 10^6 sequences to have a higher probability of containing aptamers
should enable microarray identification of binders.
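
The density argument can be made concrete with an expected-value estimate. The sketch below is a simplified illustration, not a result from this work: it assumes binders are uniformly distributed at the frequencies of 1 in 10^9 to 1 in 10^13 quoted in Section 2.1, and the function names are introduced here.

```python
def expected_binders(library_size: float, binder_frequency: float) -> float:
    """Expected number of binders in a library, assuming binders occur
    uniformly at the given frequency (an idealization)."""
    return library_size * binder_frequency


def required_enrichment(library_size: float, binder_frequency: float) -> float:
    """Fold-increase in binder frequency needed to expect one binder in the library."""
    return 1.0 / expected_binders(library_size, binder_frequency)


ARRAY_SIZE = 1e6    # highest-density array
SELEX_SIZE = 1e15   # typical SELEX starting library

for frequency in (1e-9, 1e-13):
    on_array = expected_binders(ARRAY_SIZE, frequency)
    in_selex = expected_binders(SELEX_SIZE, frequency)
    fold = required_enrichment(ARRAY_SIZE, frequency)
    print(f"frequency {frequency:.0e}: array {on_array:.0e}, "
          f"SELEX {in_selex:.0e}, enrichment needed {fold:.0e}x")
```

Under these assumptions a random 10^6-member array is expected to contain far less than one binder, so the starting library would need to be enriched by roughly 10^3- to 10^7-fold before a binder is expected on the chip.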

2.3  In  Silico  Starting  Library  Design 

In  this  work  we  designed  a  number  of  different  starting  library  patterns  in  order  to  circumvent 
the  microarray  density  problem  and  increase  the  chances  of  identifying  aptamers  binding  the 
target.  Aptamers  usually  show  complex  secondary  and  tertiary  structures  composed  of  a 
number  of  features  like  loops,  stems,  and  multi-arm  junctions  that  interact  with  each  other  to 
stabilize the structure of the aptamer-target complex. It has been determined that the
probability for a sequence to bind a target improves with increasing structural complexity27.
This  means  unstructured  sequences  or  oligonucleotides  that  form  simple  structures  have 
reduced  potential  to  show  any  type  of  function.  In  a  typical  SELEX  experiment,  a  random 
pool  of  oligonucleotides  is  used  to  search  for  binders  for  a  particular  target.  Constituents  of  a 
random  pool  have  mostly  unpaired  regions  combined  with  short  (low  stability)  stem-loop 
structures,  and  the  probability  of  finding  more  complex,  high-affinity  aptamers  in  the  starting 
random  library  is  very  low28.  This  was  experimentally  confirmed  by  Davis  and  Szostak  who 
selected  aptamers  from  a  mixture  of  fully  random  and  partially  structured  RNA  libraries29. 
They  identified  6  sequence  families  of  aptamers.  Four  of  these  families  came  from  the 
designed  library  while  two  families  were  from  the  random  pool.  It  was  also  found  that 
aptamers  with  the  highest  affinity  were  selected  from  the  designed  library  of  sequences  with  an 
engineered  multi-base  (high  stability)  stem-loop. 

Recently,  there  have  been  several  attempts  to  design  enhanced  initial  pools  of  sequences  for 
selection of high-affinity aptamers. Luo et al. have developed Random Filtering and Genetic
Filtering methods to increase the number of five-way junctions or to design a uniform structure
distribution in the starting RNA/DNA pools30. One of the aptamers selected by SELEX from
the designed structural libraries displayed higher affinity for ATP than a previously selected
low-complexity aptamer, while other aptamers showed weaker affinity. Thus, higher
complexity itself does not always lead to better affinity. In another approach, Ruff et al.
designed a patterned library to enhance the formation of stem-loop structures31. For the three
SELEX selections performed, the patterned library substantially outperformed the unpatterned
library in two cases and performed at least as well as the unpatterned library in the third case.

Here,  we  present  a  set  of  related  libraries  that  have  been  compared  by  their  abilities  to  produce 
structurally  enriched  sequences  by  applying  a  series  of  constraints  (see  Section  3.4).  This  set  is 
composed  of  four  different  patterned  libraries  of  50nt  DNA  sequences  used  as  an  initial  pool 
for aptamer selection:

Liu  -  (RY)3-N4-(RY)4-N3-(RY)4-N4-(RY)4-N3-(RY)3
PT1  -  (RRYY)2-N4-(RRYY)-N3-(RRYY)-N4-(RRYY)-N3-(RRYY)-N4-(RRYY)2
PT2  -  (RRYY)2-N4-(RRRYYY)-N4-(RRRYYY)-N4-(RRRYYY)-N4-(RRYY)2
PT3  -  (RRYY)2-N4-(RY)3-N4-(RY)3-N4-(RY)3-N4-(RRYY)2


R=(A,G),  Y=(T,C),  N=(A,C,G,T) 

The first library (modified from Liu) has a pattern similar to one proposed by Ruff et al31 from
the Liu group. It consists of alternating purines (R=A,G) and pyrimidines (Y=T,C) separated
by completely random regions of N4 and N3. The second pattern (PT1) was designed to
maximize the number of four-way junctions30, as depicted in Figure 3. Pattern Liu has 14
completely random bases while pattern PT1 has 18 random bases. Patterns PT2 and PT3 were
designed with the same number of random bases, 16, but PT2 allows for three purines or three
pyrimidines in a row while PT3 forces alternation of internal purines and pyrimidines.
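
The pattern notation above can be expanded mechanically. The following Python sketch is illustrative only and is not the code used in this work; the names PATTERNS, expand, and random_sequence are assumptions introduced here. It expands each pattern into a 50-position template, reports the number of fully random (N) positions, and draws one random sequence per template.

```python
import random
import re

# Degenerate base codes as defined above: R = purine, Y = pyrimidine, N = any base.
CODES = {"R": "AG", "Y": "TC", "N": "ACGT"}

PATTERNS = {
    "Liu": "(RY)3-N4-(RY)4-N3-(RY)4-N4-(RY)4-N3-(RY)3",
    "PT1": "(RRYY)2-N4-(RRYY)-N3-(RRYY)-N4-(RRYY)-N3-(RRYY)-N4-(RRYY)2",
    "PT2": "(RRYY)2-N4-(RRRYYY)-N4-(RRRYYY)-N4-(RRRYYY)-N4-(RRYY)2",
    "PT3": "(RRYY)2-N4-(RY)3-N4-(RY)3-N4-(RY)3-N4-(RRYY)2",
}

# A block is either "(UNIT)count" or "Xcount"; a missing count means 1.
BLOCK = re.compile(r"\(([RYN]+)\)(\d*)|([RYN])(\d*)")


def expand(pattern):
    """Expand e.g. '(RY)3-N4' into the template string 'RYRYRYNNNN'."""
    template = []
    for block in pattern.split("-"):
        m = BLOCK.fullmatch(block)
        if m is None:
            raise ValueError(f"unrecognized block: {block!r}")
        unit = m.group(1) or m.group(3)
        count = m.group(2) if m.group(1) else m.group(4)
        template.append(unit * int(count or 1))
    return "".join(template)


def random_sequence(template, rng):
    """Draw one DNA sequence that satisfies the template."""
    return "".join(rng.choice(CODES[code]) for code in template)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only so that the sketch is repeatable
    for name, pattern in PATTERNS.items():
        template = expand(pattern)
        print(name, len(template), template.count("N"), random_sequence(template, rng))
```

Each template is 50 positions long, and the N counts (14, 18, 16, and 16) match the numbers of random bases quoted above.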


Figure  3:  Example  of  Four-Way  Junction  Structure  Generated  with  Pattern  PT1 

In this work, we used in silico patterned libraries to “design” a starting library with a
higher probability of containing binding sequences on a microarray in order to discover new
IgE aptamers. IgE is part of a well-characterized aptamer/target pair with positive and negative
controls, which makes it ideal for initial studies. Once validated, the methods will be modified
to target biomarkers of interest. The utility of a direct and an indirect labeling method was compared.
We also gathered information indicating that microarrays can be used to probe
aptamer tertiary structures and can even be applied to small molecule aptamer identification.
The aim of the work is to serve as a proof-of-principle study showing that microarrays can speed up
the aptamer identification and characterization process to rapidly provide recognition
elements for various biomarkers of interest for “plug-and-play” integration into riboswitch or
traditional biosensor detection platforms.

3.0  METHODS, ASSUMPTIONS, AND PROCEDURES


3.1  Chemicals  and  Equipment 

IgE (Fitzgerald), Illustra NAP-25 desalting columns and Cy3 Mono-Reactive Dye Pack (GE
Healthcare), NanoDrop (Thermo Scientific), nuclease free water (Gibco). Microarray
equipment: custom 8X15k and 1X1M DNA microarrays, 8X15k and 1X1M gasket slides,
ozone barrier slides, hybridization chambers, scanner cassettes, hybridization oven, and
High-resolution Microarray Scanner (all Agilent). Slide rack and wash dishes (Shandon), Kimtech
polypropylene wipes (Kimberly-Clark). FluoReporter Mini-biotin-XX Protein Labeling Kit
(Invitrogen), Streptavidin-Cy3 Conjugate and HABA/Avidin Reagent (Sigma), dialysis
cassettes (3.5k MWCO, Thermo Scientific).

Buffers:
Binding [PBSMTB]- 1X PBS (8.1 mM Na2HPO4, 1.1 mM KH2PO4, 2.7 mM KCl, 137 mM NaCl, pH 7.4) + 1 mM MgCl2 + 0.1% Tween-20 and 1% BSA
Washing [PBSM]- 1X PBS (8.1 mM Na2HPO4, 1.1 mM KH2PO4, 2.7 mM KCl, 137 mM NaCl, pH 7.4) + 1 mM MgCl2
Rinse [R]- 1/4 dilution of PBSM in nuclease free water
RB Wash Buffer- 20 mM HEPES, 300 mM KCl, 5 mM MgCl2
RB Binding Buffer- RB wash buffer + 0.1% Tween-20 + 1% BSA
IgE binding buffer without K+- (8.1 mM Na2HPO4, 137 mM NaCl, pH 7.4) + 1 mM MgCl2 + 0.1% Tween-20 and 1% BSA
IgE washing buffer without K+- (8.1 mM Na2HPO4, 137 mM NaCl, pH 7.4) + 1 mM MgCl2

3.2  Direct  Labeling  Method-  Immunoglobulin  E  (IgE)  Cy3-Labeling 

IgE was diluted with 0.1 M sodium carbonate buffer (pH= 9.3) to 1 mg/mL. One mL of the IgE
dilution was added to one Cy3 dye pack and incubated for 30 min. Free dye was separated from
IgE-bound dye by purification with a desalting column. The dye to protein ratio (D/P) was
calculated using the manufacturer’s instructions with the aid of NanoDrop UV/Vis detection.
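
A D/P calculation of this kind can be sketched as follows. This is a minimal illustration and not the manufacturer's worksheet: the function name and arguments are assumptions, the Cy3 extinction coefficient (150,000 M-1 cm-1 near 552 nm) and the 280 nm correction factor (0.08) are commonly quoted values that should be checked against the dye pack instructions, and the protein extinction coefficient must be supplied by the user because it is not given in this report.

```python
def dye_to_protein(a280, a_dye, eps_protein, eps_dye=150_000.0, cf280=0.08):
    """Estimate the dye-to-protein molar ratio from two absorbance readings.

    a280        absorbance of the conjugate at 280 nm
    a_dye       absorbance of the conjugate at the dye maximum (about 552 nm for Cy3)
    eps_protein molar extinction coefficient of the protein at 280 nm (user supplied)
    eps_dye     molar extinction coefficient of the dye (assumed default for Cy3)
    cf280       fraction of the dye absorbance that also appears at 280 nm (assumed default)

    Both readings must use the same path length, which then cancels in the ratio.
    """
    protein_abs = a280 - cf280 * a_dye  # remove the dye contribution at 280 nm
    if protein_abs <= 0:
        raise ValueError("corrected A280 is not positive; check the readings")
    dye_conc = a_dye / eps_dye
    protein_conc = protein_abs / eps_protein
    return dye_conc / protein_conc


if __name__ == "__main__":
    # Illustrative readings only, not data from this report.
    # 170,000 is a placeholder often used for IgG; substitute the value for the protein in use.
    print(round(dye_to_protein(a280=0.50, a_dye=0.90, eps_protein=170_000.0), 2))
```

Free dye that survives the desalting step contributes to the dye absorbance, so a ratio computed this way is an upper bound on the true labeling ratio until the sample has been shown to be free of unbound dye.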

3.3  8X15k  Microarray  Optimization 

The 8X15k microarrays were employed to optimize conditions based on the response of
several reported IgE binding and nonbinding sequences. Initial library patterning was reserved
for 1X1M microarrays, which could accommodate all 50k sequences in replicate (although
8X15k arrays were “filled” with either 7k or 5k PT1 sequences in duplicate). The DNA
microarray was blocked with PBSMTB in a 50 mL conical tube for 1 h at
room temperature. Slides were disassembled in PBSM buffer and rinsed for 5 min at room
temperature using a stir plate. The slides were quickly edge-tapped to remove excess buffer.
Seventy µL of protein (variable concentrations) in PBSMTB was loaded onto the gasket slide
and then incubated with each of the 8 DNA arrays for 2 h at 20°C in a hybridization
chamber in the hybridization oven. Slides were disassembled in PBSMTB buffer and washed for 3
min in a separate PBSM buffer with the slide rack and stir plate, then the slide rack was
transferred to 1/4 PBSM buffer/water for 1 min using a stir plate. Slides in the slide rack were
then dipped in nuclease free water to remove any remaining salt and washed for 1 min with the stir
plate. The rack was slowly withdrawn from the water to promote a drier surface. The back of
each slide was wiped with ethanol, and the slide was placed in a 50 mL conical tube with a polypropylene
wipe at the bottom and centrifuged at 4150 rpm for 3 min. The microarray was loaded into a
scanner cassette and covered with an ozone barrier slide before scanning.


Distribution A: Approved for public release; distribution is unlimited.
88ABW-2014-0458; Cleared 11 February 2014


3.4  Microarray  Starting  Library  Design 

UNAFold software was used to screen DNA from each pattern and determine which sequences
folded (folding conditions: T= 25°C, [Na+]= 100 mM, [Mg2+]= 5 mM). Code was written in Perl
to extract sequences that fit a series of constraints: the 1st base must pair with the 50th base;
the total number of unpaired bases must satisfy 10<unpaired<30; and there must be a minimum of
two stretches of >4 unpaired bases. Fifty thousand sequences that fit the constraints were
incorporated onto the 1X1M microarray chip with a T10 spacer in replicates of 4.
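
The structural constraints can be expressed compactly in code. The Perl code used in this work is not reproduced in the report, so the following Python sketch is only an illustration under stated assumptions: the predicted structure is assumed to have already been converted to dot-bracket notation (the conversion from the folding output is not shown), the function names are hypothetical, and “>4 unpaired bases” is read literally as a run of at least five unpaired positions.

```python
import re


def pair_table(dot_bracket):
    """Return a list in which entry i is the 0-based pairing partner of base i, or None."""
    partners = [None] * len(dot_bracket)
    stack = []
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            j = stack.pop()
            partners[i], partners[j] = j, i
        elif symbol != ".":
            raise ValueError(f"unexpected symbol: {symbol!r}")
    if stack:
        raise ValueError("unbalanced structure")
    return partners


def fits_constraints(dot_bracket, length=50, min_unpaired=10, max_unpaired=30,
                     stretch_len=5, min_stretches=2):
    """Apply the three structural constraints described above to one predicted structure."""
    if len(dot_bracket) != length:
        return False
    partners = pair_table(dot_bracket)
    # Constraint 1: the first base pairs with the last base.
    if partners[0] != length - 1:
        return False
    # Constraint 2: 10 < unpaired < 30 (strict inequalities, as written).
    unpaired = dot_bracket.count(".")
    if not (min_unpaired < unpaired < max_unpaired):
        return False
    # Constraint 3: at least two runs of more than four unpaired bases.
    stretches = re.findall(r"\.{%d,}" % stretch_len, dot_bracket)
    return len(stretches) >= min_stretches


if __name__ == "__main__":
    hairpin = "((((......))))"
    example = "(((((" + ".." + hairpin + "....." + hairpin + "....." + ")))))"
    print(len(example), fits_constraints(example))
```

Counting how many candidate sequences are examined before one passes this filter gives one simple way to compare patterns.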

3.5  1X1M  Microarray 

The basic protocol was the same for the 1X1M arrays as for the 8X15k arrays described above.
The only difference in methodology was that the gasket slide held a 750 µL volume instead of
70 µL.


3.6  Indirect  Labeling  Method-  IgE  Biotinylation 

Biotinylation of IgE was carried out according to the manufacturer’s instructions, and the
product was purified using the provided spin column. The biotin/protein ratio was calculated to
be 0.6 according to the HABA/Avidin test.

3.7  8X15k  and  1X1M  Microarrays  (Indirect  Labeling  Method) 

The 8X15k and 1X1M microarrays were analyzed in the same manner as in the direct labeling
method, with some exceptions. Following incubation with biotin-IgE, slides were disassembled
in PBSMTB buffer and rinsed for 3 min on a shaker. Slides were transferred to PBSM buffer
for 3 min, also on a shaker. Then 0.5, 10, or 500 nM Cy3-streptavidin (Cy3-SA) was loaded
into the gaskets and incubated for 2 min. The washing steps were the same as in the direct
labeling method.

3.8  8X15k  Arrays 

The standard control array was performed as in Section 3.3, except that the samples were varied:
100 nM samples: Cy3-IgE D/P= 2.3, 4.8, and 11, Cy5, Cy3-BSA, TurboRFP; 10 nM samples:
Cy3-SA (to mimic the biotinylated experiments), Cy3-estradiol antibody (the sample was
dilute following purification). The extended control array was also similar to Section 3.3,
except that all Cy3 only arrays were performed with 50 nM sample, and all Cy3-proteins were at
100 nM. When the buffers were varied, the arrays were incubated in the buffer of interest
before sample loading, and slides were disassembled in water then quickly transferred (3
seconds) to PBSM for the subsequent washing steps so that the PBSM would not cause DNA
folding with the samples remaining on the arrays.

For the final D/P analysis, 100 µL of sample was dialyzed for 48 hours in a 3.5k MWCO
dialysis cassette. The dialysate was MilliQ water, exchanged after 2 and 4 hours. The dye to
protein ratio (D/P) was calculated using the manufacturer’s instructions with the aid of NanoDrop
UV/Vis detection.

3.9  Data Analysis

The arrays were scanned using Agilent Scan Control software. Images (TIFF) were generated
using 20-bit imaging at 3 µm (for 1X1M arrays) or 5 µm (8X15k arrays). Data were extracted
using Agilent Feature Extraction software version 10.7.3.1. Mean fluorescence intensity and
other values of interest for replicates were determined using code written in Perl.
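
The replicate averaging and ranking step can be sketched as below. This is a Python illustration, not the Perl code used in this work; the function names are hypothetical, and parsing of the feature extraction output into (probe ID, intensity) pairs is assumed to have been done already.

```python
from collections import defaultdict
from statistics import mean


def mean_intensity_by_probe(features):
    """Average replicate spots. `features` is an iterable of (probe_id, intensity) pairs."""
    grouped = defaultdict(list)
    for probe_id, intensity in features:
        grouped[probe_id].append(float(intensity))
    return {probe_id: mean(values) for probe_id, values in grouped.items()}


def rank_probes(mean_by_probe):
    """Rank 1 is the highest mean intensity; ties are broken by probe ID for repeatability."""
    ordered = sorted(mean_by_probe, key=lambda probe: (-mean_by_probe[probe], probe))
    return {probe: rank for rank, probe in enumerate(ordered, start=1)}


if __name__ == "__main__":
    # Made-up intensities for three hypothetical probes, two replicate spots each.
    spots = [("probe_a", 100), ("probe_a", 300), ("probe_b", 50),
             ("probe_b", 150), ("probe_c", 1000), ("probe_c", 1200)]
    means = mean_intensity_by_probe(spots)
    print(means)
    print(rank_probes(means))
```

Ranks computed this way for each experiment are the quantities compared across the Cy3 only, Cy3-IgE, and biotin-IgE trials in Section 4.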

3.10  Surface  Plasmon  Resonance  (SPR) 

Biacore studies were carried out on a CM7 chip with neutravidin custom immobilized on the
surface. The surface was activated with a mixture of
1-ethyl-3-[3-dimethylaminopropyl]carbodiimide (0.2 M) and N-hydroxysuccinimide (0.05 M) for 420 sec at
10 µL/min. Neutravidin (10 mg/mL) was dissolved in HyClone water then diluted 1/10 in 10
mM sodium acetate buffer (pH= 4.5) and added at 30 µL/min for 210 sec. The surface was
blocked with 1 M ethanolamine for 600 sec at 5 µL/min. Biotinylated 4A018 aptamer was
heated to 95°C, then introduced at 5 µM in 5 mM MgSO4 for 180 sec at 30 µL/min. Samples
were diluted to 27.3 µM in HEPES buffer from Biacore (10 mM HEPES, 150 mM NaCl, 0.05%
Surfactant P20) and dialyzed into HEPES buffer overnight. The samples were serially diluted
to appropriate concentrations, then introduced onto the chip at 30 µL/min for 30 sec. Data
analysis was performed with BIAevaluation software.

4.0  RESULTS AND DISCUSSION


4.1  Starting  Library  Design 

As described above, a major challenge of using microarrays for selection is the limited surface
density of the arrays (10^6 features) compared to SELEX (10^15 sequences). Therefore, we designed a starting
library populated with sequences that have a higher probability of binding than a random
library. Again, the Liu pattern was adapted from the literature, PT1 was designed to maximize the
number of junctions, PT2 is a variation on the Liu pattern that allows for internal and
peripheral base repeats, and the PT3 variation allows for base repeats at the periphery only.

The selection rate, or number of sequences analyzed until one fits the constraints, was highest
for the Liu pattern (Fig. 4A), but the structural diversity was highest for PT3 (Fig. 4B). This
shows that the Liu pattern has the fewest sequences in the starting pool that fit the
criteria, and the structures formed are less diverse than those of most of the less stringent patterns. This
could mean that the Liu pattern will not generate as many binders, or it could mean that the
stringency of the pattern actually yields higher affinity sequences. In contrast, PT3 generated
sequences that fit the constraints more frequently, and it also generated the most diverse
structures. More complex and diverse structures in an initial pool have been suggested to lead to a
better likelihood of identifying a high-affinity aptamer25,27, so these results indicate that PT3
may be the “ideal” pattern of those studied under these conditions based on a simplified in
silico analysis.

Figure 4: Starting Library Design

4.2  Direct Labeling Method

IgE was labeled with Cy3 (D/P= 11.8) according to standard methods to facilitate detection by
the scanner. The IgE positive and negative controls were used to optimize the method and
[Cy3-IgE] in the 8X15k array format prior to aptamer identification from the library (Fig 5A). The
positive controls showed a clear increase in response with increasing linker length and
[Cy3-IgE] (Fig 5B), while the negative control intensity (SA= mutated streptavidin aptamer) did not
increase with linker length, but showed a slight background increase with increasing [Cy3-IgE]
(Fig 5C). We also saw that the identity of the linker, A, C, G, or T, affects the fluorescence
intensity (Fig 5D), with the T linker being the most consistent performer across linker lengths.
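
A concentration series such as the one in Fig 5B is commonly reduced to an apparent dissociation constant and a goodness-of-fit value. The report does not state which binding model was fitted, so the following Python sketch simply assumes a one-site isotherm, signal = Fmax * c / (Kd + c); the function names and the search bounds are assumptions made for illustration.

```python
import math


def _best_fmax(conc, signal, kd):
    """For a fixed Kd, the least-squares Fmax has a closed form; return it with the SSE."""
    x = [c / (kd + c) for c in conc]
    fmax = sum(s * xi for s, xi in zip(signal, x)) / sum(xi * xi for xi in x)
    sse = sum((s - fmax * xi) ** 2 for s, xi in zip(signal, x))
    return fmax, sse


def fit_one_site(conc, signal, kd_lo=1e-3, kd_hi=1e6, iterations=200):
    """Fit signal = Fmax * c / (Kd + c) by golden-section search over log10(Kd).

    Kd is returned in the same units as `conc`. The search assumes that the error
    surface has a single minimum between kd_lo and kd_hi.
    """
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log10(kd_lo), math.log10(kd_hi)

    def sse(log_kd):
        return _best_fmax(conc, signal, 10.0 ** log_kd)[1]

    c1, c2 = b - phi * (b - a), a + phi * (b - a)
    f1, f2 = sse(c1), sse(c2)
    for _ in range(iterations):
        if f1 < f2:
            b, c2, f2 = c2, c1, f1
            c1 = b - phi * (b - a)
            f1 = sse(c1)
        else:
            a, c1, f1 = c1, c2, f2
            c2 = a + phi * (b - a)
            f2 = sse(c2)
    kd = 10.0 ** ((a + b) / 2.0)
    fmax, residual = _best_fmax(conc, signal, kd)
    mean_signal = sum(signal) / len(signal)
    total = sum((s - mean_signal) ** 2 for s in signal)
    r_squared = 1.0 - residual / total if total > 0 else float("nan")
    return kd, fmax, r_squared


if __name__ == "__main__":
    # Synthetic, noise-free data generated with Kd = 25 and Fmax = 1000 (arbitrary units).
    conc = [1, 5, 10, 50, 100, 500]
    signal = [1000.0 * c / (25.0 + c) for c in conc]
    print(fit_one_site(conc, signal))
```

When the highest concentration tested is well below saturation, Kd and Fmax become strongly correlated in a fit of this form, and the fitted Kd carries a correspondingly large uncertainty.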

An interesting phenomenon occurred when Kd was plotted against linker length (Fig 5E). The
Kd values are more variable at low linker lengths, then appear to plateau at a certain length.
This also corresponds to an increase in the “goodness of fit” of the data (Fig 5F), suggesting
that a minimal linker length is required for accurate data. At short lengths, the Cy3-IgE may be
better able to interact nonspecifically with the surface, BSA, or other components of the
buffer, or the aptamer may be too rigidly confined to fold into the proper binding structure. The
negative control, D17-4SM17, appears to have the best Kd, but the fit of the data is variable
across lengths, and never reaches acceptable limits (>0.95). The curves do not reach
saturation, which affects the error associated with the curve fit of the data. Based on the
results, 100 nM Cy3-IgE was applied to the 1X1M feature selection array (Fig 5G), and several
probes were identified demonstrating comparable fluorescence to the positive controls. These
top binding probes were sent for Biacore (Biosensor Tools, Salt Lake City, UT) testing, but
did not demonstrate any binding activity (Fig 5H). A similar array protocol was then used with
Cy3 dye alone. The top 100 ranked (mean fluorescence intensity) Cy3 binders were
cross-referenced with the top 500 Cy3-IgE binders (Table 1). The results showed that the top
Cy3-IgE binders were more likely binding the dye than the IgE, which accounts for the lack of
Biacore activity (Table 1, bold probes).

Figure 5: Results from Direct Labeling Method

Table 1: Cy3-IgE Fluorescence Intensity Rank versus Cy3 Dye Rank

IgE Rank   Probe ID   Cy3 Rank
344        5D837      95
60         4L137      91
149        6L573      90
15         1A411      87
54         2D539      85
32         0L574      79
59         5E589      78
150        9M745      75
43         8G954      73
155        6E768      63
63         90285      55
30         20166      53
48         5A255      44
56         3M083      42
33         8L716      39
24         9N665      32
12         2K864      24
35         50660      23
23         9J881      19
26         4H813      16
21         9B301      13
27         2K027      12
71         3M814      11
7          01505      10
16         5K476      9
36         1B194      8
17         1B463      4
6          1M969      3
10         8N621      2
20         4L969      1

IgE Rank   Probe ID   Cy3 Rank
15         1A411      87
4          69776
3          0S861      1257
2
1          5K296      1894

4.3  Indirect Labeling Method


A new scheme, based on microarray work reported in the literature, was developed to decrease the
amount of nonspecific binding. In this method, the target is biotinylated and incubated with the
array as in the previous method (Fig 2). However, after this step, a short Cy3-SA incubation is
performed before data extraction. Two optimization experiments (Fig 6A-B) were carried out
similar to the Cy3-IgE experiments discussed above. Visual comparison of the arrays made it
clear that binding was due to the biotin-IgE rather than the Cy3-SA (Fig 6C-D). Some
minimal binding to the positive IgE controls was observed in the Cy3-SA only array, but this is
believed to be a function of the communal washing procedure making biotin-IgE available
rather than of Cy3-SA binding. The signal was orders of magnitude higher when 10 nM
biotin-IgE was included, but the negative control values did not change significantly (see data in
Tables A1-2; blue designates negative controls, green designates positive controls). The raw
signal at a comparable concentration ([IgE]= 10 nM) was enhanced 750X in the biotin labeling method (Fig 6F)
compared to the direct (Cy3-IgE) labeling method, and the LOD was improved from ~500 pM
(direct) to ~4 pM (indirect). Figure 6G shows a maximum S/N enhancement of ~700X at
[biotin-IgE]= 25 nM for the indirect method compared to the direct dye-IgE method (Fig
6H). We selected [biotin-IgE]= 25 nM and [Cy3-SA]= 10 nM as the conditions for the aptamer
identification experiment because they displayed the highest S/N. These results show
that the indirect labeling method has enhanced signal and lower nonspecific binding to
negative controls. We are in the process of testing lower D/P in order to rule out binding site
occlusion or resonance energy transfer as the source of lower signal in the direct labeling
method32.
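
The comparison between labeling methods rests on a signal-to-noise figure. The report does not define how S/N was computed, so the short Python sketch below assumes one simple convention, the mean positive control intensity divided by the mean negative control intensity; the function names and the numbers in the example are illustrative only.

```python
from statistics import mean


def signal_to_noise(positive, negative):
    """S/N taken here as mean positive-control signal over mean negative-control signal."""
    noise = mean(negative)
    if noise <= 0:
        raise ValueError("mean negative-control signal must be positive")
    return mean(positive) / noise


def fold_enhancement(sn_new, sn_reference):
    """How many times larger the S/N of one method is than that of a reference method."""
    return sn_new / sn_reference


if __name__ == "__main__":
    # Made-up intensities, not data from this report.
    direct = signal_to_noise(positive=[520, 480], negative=[95, 105])
    indirect = signal_to_noise(positive=[41000, 39000], negative=[11, 9])
    print(direct, indirect, fold_enhancement(indirect, direct))
```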

Two 1X1M aptamer identification experiments with the biotin-IgE were performed with
different biotin-IgE concentrations to rule out experimental artifacts (Fig 7A-B). See Figure
A1 (Appendix) for actual images from each location. The patterned DNA was ranked by mean
fluorescence intensity in each experiment, and compared with how the same sequences ranked
in the Cy3 only and Cy3-IgE trials (Fig 7C). (Blue= PT1; Red= Liu; Green= PT2; Purple=
PT3; black boxes indicate sequences that appear to be good biotin and/or Cy3 binders.) Many
of the top biotin-IgE binders were also top Cy3 and/or Cy3-IgE binders, indicating that these
sequences are probably binding the small molecule tags rather than the IgE protein. By
expanding the search parameters past the top 25 biotin-IgE binders, we were able to identify
two top candidates (9D551 and 6K736) that show high ranking biotin-IgE binding, but poor
Cy3 binding (Fig 7D). Although the fluorescence intensity of these probes is not nearly as
high as that of the positive controls, they may represent new IgE binding motifs, and will be tested for
binding by surface plasmon resonance (SPR).
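
The rank cross-referencing described above amounts to a simple filter: keep probes that rank highly with the labeled target but poorly in every tag-only or dye-only experiment. The Python sketch below illustrates the idea; the function name and the cutoff values are hypothetical and are not the thresholds used in this work.

```python
def tag_independent_candidates(target_rank, tag_ranks, top_n=100, tag_cutoff=500):
    """Return probes within the top_n of the target experiment whose rank is worse
    than tag_cutoff in every tag or dye control experiment.

    target_rank  dict of probe ID -> rank in the labeled-target experiment (1 = brightest)
    tag_ranks    list of dicts of probe ID -> rank, one per control experiment
    A probe that is absent from a control experiment is treated as a non-binder there.
    """
    candidates = []
    for probe, rank in sorted(target_rank.items(), key=lambda item: item[1]):
        if rank > top_n:
            break  # ranks are sorted, so no later probe can qualify
        if all(ranks.get(probe, float("inf")) > tag_cutoff for ranks in tag_ranks):
            candidates.append(probe)
    return candidates


if __name__ == "__main__":
    # Hypothetical probes and ranks for illustration only.
    biotin_ige = {"probe_a": 5, "probe_b": 30, "probe_c": 200}
    cy3_only = {"probe_a": 12, "probe_b": 4000, "probe_c": 9000}
    cy3_ige = {"probe_a": 3, "probe_b": 2500, "probe_c": 8000}
    print(tag_independent_candidates(biotin_ige, [cy3_only, cy3_ige]))
```

Widening top_n corresponds to expanding the search past the top 25 binders, at the cost of admitting probes with lower absolute fluorescence.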

Figure 6: Results from Indirect Labeling Method Optimization

Figure 7: Indirect Labeling Method Aptamer Identification Results

We also compared the sequences on a number of parameters to determine if any trends were
present which could indicate the best options for SPR binding testing. After analyzing
parameters similar to Figure 7D for a larger population of sequences (high/low ranked dye,
Cy3-IgE, or biotin-IgE binders), clear trends emerged for some of the conditions. PT2
sequences tended to be good biotin and dye binders, holding ~1/2 or ~1/3 of the top 25
fluorescence intensities for the two tags. G-quartet structures are known to bind small planar
or aromatic molecules, meaning PT2 sequences forming these structures may be binding the
tags instead of the protein. After analysis of pattern design, we observed that PT2 has a
higher probability of forming G-quartet structures due to the potential to contain the highest
number (7) of G repeats, three of which are R groups with a 50% chance of containing G.

Also, a “wand” secondary structure was indicative of good biotin-IgE (B-IgE) binders and poor dye and
dye-IgE binders, making sequences that form these structures good candidates for IgE binding
(Fig 8). In contrast, a “Y” secondary structure binds all categories well (biotin-IgE, dye, and
dye-IgE), indicating that dye/biotin interactions may be causing the signal. This agrees with prior
reports that the more complex Y structure forms more pockets for small molecule binding than
a stem/loop wand structure. Also, PT1 appeared among both good and poor dye and B-IgE binders,
PT2 sequences bound both dye and B-IgE well and contained the most G repeat sets in the
structures studied, and PT3 sequences tended to be good B-IgE binders and poor dye/dye-IgE binders. The
Liu pattern did not have enough representatives in any category to distinguish any trends.
Sequences beginning with GGTTGG(CC or TT) were frequently observed among top dye and
biotin binders (similar to the first several bases of the GQ-forming TFBS). Sequence 6N686
begins with this motif and did not bind IgE in the SPR studies. Therefore, a wand structure of
PT3 with few G repeat sets and minimal dye binding would be an ideal candidate, although none
of the sequences in Figure 7D perfectly fit this mold. The trends only serve as generalizations,
and a broader picture of individual sequences should be obtained before summarily ruling
sequences out for reasons other than known dye binding.
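
The sequence-level trends above (leading GGTTGG(CC or TT) motif, number of G repeat sets) can be screened for automatically. The Python sketch below is a hedged illustration rather than an analysis used in this work: the function names are hypothetical, a “G repeat” is taken to mean a run of two or more consecutive G bases, and the threshold of four runs reflects the four G-tracts needed for an intramolecular G-quartet structure.

```python
import re

# Leading motif noted among top dye and biotin binders: GGTTGG followed by CC or TT.
LEADING_MOTIF = re.compile(r"^GGTTGG(CC|TT)")


def g_runs(sequence, min_len=2):
    """Return every run of at least min_len consecutive G bases."""
    return re.findall(r"G{%d,}" % min_len, sequence.upper())


def flag_possible_tag_binder(sequence, min_runs_for_quartet=4):
    """Flag a sequence that starts with the leading motif or that has enough G runs
    to form a G-quartet structure. A flag is a reason for caution, not proof of tag binding."""
    sequence = sequence.upper()
    has_motif = LEADING_MOTIF.match(sequence) is not None
    return has_motif or len(g_runs(sequence)) >= min_runs_for_quartet


if __name__ == "__main__":
    thrombin_15mer = "GGTTGGTGTGGTTGG"  # well-known 15-mer thrombin-binding G-quartet aptamer
    print(g_runs(thrombin_15mer), flag_possible_tag_binder(thrombin_15mer))
```

In keeping with the caution above, such a flag is best used to prioritize sequences for dye-only controls rather than to exclude them outright.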

4.4  G-Quartet Aptamer Studies

An  interesting  observation  noted  in  all  testing  formats  was  that  the  thrombin  fibrinogen 
binding  site  aptamer  (TFBS)  was  demonstrating  significant  fluorescence  signal.  In  the  Cy3- 
IgE  experiments,  the  TFBS  signal  was  several  times  higher  than  the  signal  for  the  positive  IgE 
controls  (Fig  9A).  The  TFBS  was  the  only  control  to  fluoresce  above  background  in  the  Cy3 
only  trial,  showing  that  it  is  not  binding  the  protein  (Fig  9B).  TFBS  also  displayed  binding  in 
the  biotin-IgE  tests  (Fig  9C),  although  the  binding  was  reduced  in  comparison  to  the  Cy3-IgE. 
This  may  be  due  to  a  higher  D/P  in  the  Cy3-IgE  experiments  relative  to  the  biotin-IgE 
experiments,  or  a  decreased  affinity  for  the  planar,  but  not  aromatic  biotin.  The  observation 
aids  in  validating  the  indirect  labeling  method  as  demonstrating  lower  nonspecific  binding. 
Based  on  these  results,  the  controls  were  extended  in  the  next  biotin-IgE  array  design  (Fig  9D- 
F).  Figure  9D  shows  the  response  of  TFBS  with  25  nM  IgE  and  10  nM  Cy3-SA  with  different 
linker  identities.  The  A  and  T  linkers  bind  similarly,  but  the  C  linker  increases  then  sharply 
decreases  at  LL=10,  and  the  G  linker  does  not  show  any  change  across  linker  length.  The  C 
linker  probably  is  increasing  in  fluorescence  as  the  probe  is  moved  further  from  the  array 
surface,  but  decreases  when  it  becomes  long  enough  to  interact  with  the  G-quartets  and  disrupt 
binding.  Adding  a  G-linker  appears  to  change  or  destroy  the  structure  of  the  quartet  so  binding 
does  not  improve  at  any  linker  length.  This  further  shows  the  importance  of  the  G-quartet  for 
binding  small  molecules.  In  Figure  9E,  minimal  binding  is  observed  for  any  linker  identity 
after  a  short  incubation  with  10  nM  Cy3  only.  This  shows  that  TFBS  can  also  bind  biotin,  and 
biotin  is  the  cause  of  the  enhanced  fluorescence  in  the  biotin-IgE  format.  The  riboflavin 
binding  aptamers  (RB),  also  G-quartets,  were  incorporated  into  this  extended  design.  Figure 
9F shows that the 3-tiered RB (RB3) binds biotin above background (10 nM biotin-IgE + 10 nM
Cy3-SA), but not as well as TFBS. The 2-tiered RB (RB2) did not show any appreciable
binding. The diminished binding of the RBs is likely due to their reported intolerance to high
sodium concentration, which is more severe for RB2 than RB3, while the effect has not been reported
for TFBS34. Also, these studies were performed in the IgE aptamer binding buffer, which can
alter the folding of aptamers not selected under these conditions. An enhanced effect for both
Cy3 and biotin binding would likely be shown for TFBS and the RBs in their respective
selection buffers. Regardless, microarrays may serve as an initial test or even an alternative to
NMR and crystallography studies to confirm aptamer G-quartet structure4.

Figure 8: Secondary Structures of Potential IgE Binders

Figure 9: G-Quartet Binding Data

4.5  Elucidation of Biotin and/or Cy3 Aptamers

When  comparing  individual  probe  performance  across  Cy3-IgE,  Cy3  only,  and  biotin-IgE 
trials  similar  to  Figure  7,  several  sequences  were  unearthed  that  may  warrant  further  study. 
Figure  10  depicts  a  number  of  probes  that  may  bind  to  both  biotin  and  SA,  biotin  only,  dye 
only,  or  unbound  dye  only  based  on  the  mean  fluorescence  intensity  rank  for  each  experiment. 
It  is  undetermined  whether  the  probes  would  bind  other  similar  dyes  or  tags,  or  other  dye- 
labeled  proteins,  or  function  in  other  buffers  at  this  time.  They  may  even  be  G-quartet  or  other 
higher  order  structures  that  bind  small  molecules  somewhat  indiscriminately.  The  probes 
could  be  utilized  in  any  way  an  aptamer  could  be  applied,  including  serving  as  positive 
controls  in  future  microarray  work.  Aptamers  could  be  generated  for  other  similar  small 
molecule  targets  by  constructing  an  in  vitro  library  of  analogous  DNA  structures  to  act  as  part 
of a dye displacement assay for “label-free” target detection.

Figure 10: Potential Biotin and/or Cy3 Aptamers

4.6  Interaction  with  G-quartets  (GQ),  New  Thrombin  Aptamers,  and  Other  Tests 

The final two microarrays were employed to answer questions related to how the samples
interact with G-quartet aptamers, and whether or not the D/P (dye to protein ratio) affects
the intensity of the Cy3-IgE/IgE aptamer interaction. One goal, as previously stated, is to
propose that microarrays can be used as a simpler method of screening ligands for G-quartet
folding by simply adding a planar, aromatic fluorescent dye such as Cy3. Previous relevant
findings are summarized in Figure 11 and include:

1) Cy3-IgE has a much lower S/N (signal-to-noise ratio) than labeling the protein with biotin,
then adding Cy3-SA (streptavidin) reporter (Fig 11A, same as Fig 6F). We were not sure if this
was due to an actual benefit of the method, or because of binding site occlusion and/or FRET
(fluorescence resonance energy transfer) from the high D/P (~11).

2) Probe D17-1, although reported as a positive control, demonstrated ~10X lower fluorescence
intensity overall, and was basically unresponsive at linker length <10 at 100 nM Cy3-IgE (Fig 11B).

3) Cy3 alone, Cy3-IgE, and biotin-IgE (to a lesser extent) all bound to the TFBS (thrombin
fibrinogen binding site) G-quartet (GQ) aptamer. Cy3-SA did exhibit TFBS signal ~7X above
background levels, but the signal did not increase dramatically (over 25X background for others)
across increasing linker length (enhanced availability) like the other samples (Figs 11C-E).
(Cy3-SA also had very low backgrounds compared to the other samples.) Since IgE has been
confirmed in the literature to not interact with TFBS, we assumed the Cy3 and biotin were the
cause of binding since they are small and planar and/or aromatic. If this was the case, why was
Cy3-SA not also binding the TFBS?

4) Cy3 alone and Cy3-IgE showed an increase in intensity with linker length for THBS which
was much smaller in magnitude than the TFBS increase. Biotin-IgE and Cy3-SA had a higher
THBS value than background (~2X), but it did not increase across linker length (Figs 11F-G).

Figure 11: Summary of Pertinent Previous Work

Figure  12A-C  shows  the  reference  key  for  the  arrays,  and  the  questions  asked  by  the  samples  of 
the  standard  control  array  and  the  extended  control  array.  The  extended  control  array  had  some 
additional  GQ-forming  aptamers  added  to  it,  specifically  the  2  and  3-tiered  riboflavin  aptamers 
(RB2  and  RB3).  If  Cy3  can  interact  with  GQ’s,  we  should  see  a  positive  result  for  the  RB’s,  and 
a  negative  result  for  non-GQ  forming  structures.  Figure  12D  is  a  table  representative  of  the 
different  reasons  we  felt  streptavidin  may  not  be  binding  the  TFBS  in  the  same  manner  as  Cy3 
and Cy3-IgE. Other proteins, namely BSA, thrombin, IgE, estradiol antibody, and a scrambled
neuropeptide Y (NPY) sequence, were evaluated to confirm or rule out factors such as MW, pI,
D/P, glycosylation, and NaAz (sodium azide) interference. The two microarray experiments
need to be compared to each other for a full picture of the relevant parameters.

The most straightforward set of comparisons involves evaluating different D/P on IgE to
determine whether binding site occlusion/FRET/protein quenching is resulting in the decreased
S/N for the IgE aptamers compared to the biotin-IgE/Cy3-SA method. High protein labeling has
been shown to both interfere with binding sites and cause FRET, which lowers fluorescence
signal32. Cy3 and biotin have not been seen to interact with the IgE aptamers (non-GQs) above
background, so binding is dictated by the protein properties (FRET, binding site availability)
rather than more dye reporting more intense binding. We produced Cy3-IgE with D/P= 2.3 and 4.8 in
addition to the D/P= 11 sample prepared before. The D/P was modulated by changing the amount of dye added to
a constant amount of IgE. D/P= 2.3 gave higher fluorescence intensity values than D/P= 11 and
4.8 for the positive controls (Fig 13A-C, compare arrays (2.3) 1_1, (11) 1_2, and (4.8) 2_1 in
legends). Interestingly, clear binding was observed with probe D17-1 at D/P= 2.3, which only
showed minimal binding in past experiments with D/P= 11. Binding is more easily observed,
presumably because of decreased FRET and/or increased site availability. D/P= 2.3 also
displayed lower negative control signal than D/P= 11 (Figs 13D-F). The S/N values for D/P= 2.3,
4.8, and 11 were 186.6, 4.9, and 5.3, respectively, vs. over 900 for the biotin-IgE/Cy3-SA experiments at the
same [IgE]. Figure 13F confirms the very low background of the mutated streptavidin aptamer
shown in previous experiments. In conclusion to point 1, we were able to improve S/N by
manipulating the D/P of Cy3-IgE. The enhancement for the biotin-IgE method may be due to
the multiple Cy3 present on the Cy3-SA (~6) reporter, or the low biotin/protein (0.8) providing
more binding site availability. Although we could further decrease the D/P of Cy3-IgE to ~0.8,
it is unlikely the enhancement would be as large as that of the biotin-IgE/Cy3-SA method because any increased
binding site availability would be counteracted by more of the bound proteins lacking a Cy3
reporter. Regardless, the S/N enhancement from the biotin-IgE/Cy3-SA method makes this
method superior to the Cy3-IgE method.

At the completion of the experiment, after ruling out all of the possibilities in Figure 12D as explanations for the 4 questions addressed above, sample purity from free dye was identified as a culprit. The three Cy3-IgE D/P’s were dialyzed through a 3.5k membrane, and absorbance values were taken following dialysis. A large amount of free dye was initially present in the D/P 2.3 and 11 samples, because the D/P after dialysis was 0.5 (from 2.3) and 2.5 (from 11). D/P 4.8 seemed more pure than the other two samples, with a final D/P= 4. This explains why the initial D/P= 4.8 sample showed the lowest fluorescence intensity values, as it actually had the highest D/P once the samples were extensively purified. This correlates with work by Hahn32 on IgG, which concludes that D/P >6 will reduce the fluorescence yield due to resonance energy transfer. Possibly the threshold is slightly lower for IgE protein. The conclusion is that the manufacturer’s protocol for free dye separation is not sufficient for aptamer microarray work because DNA (especially GQ’s) can interact with any free dye present, resulting in false positives for binding. Previous aptamer array work was less concerned with this because the authors were either selecting for an intrinsically fluorescent molecule, or because the main aim was detection of fluorescent-labeled targets rather than selecting new aptamers14,16,17,19,22. New aptamers were selected for Cy3-thrombin20, but thrombin binds GQ structures, and probably with a much higher affinity than that of the dye for the GQ. So the specific thrombin interaction would likely take precedence over the lower affinity dye binding.
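The report does not state how D/P was computed from the post-dialysis absorbance readings. The following is a minimal sketch of the usual absorbance-based estimate, under stated assumptions: the Cy3 extinction coefficient and 280 nm correction factor are commonly quoted values rather than values from this work, and the protein extinction coefficient is a placeholder parameter that would have to be supplied for IgE.

```python
# Sketch only: dye-to-protein ratio (D/P) from two absorbance readings.
# Both constants below are assumptions (commonly quoted values for Cy3),
# not values reported in this work.
CY3_EXTINCTION = 150_000.0   # assumed, M^-1 cm^-1 at the Cy3 absorbance peak
CY3_A280_CORRECTION = 0.08   # assumed fraction of the Cy3 peak absorbance seen at 280 nm


def dye_to_protein(a_dye, a_280, protein_extinction, path_cm=1.0):
    """Estimate D/P from the dye-peak absorbance and the 280 nm absorbance.

    protein_extinction is a hypothetical input (M^-1 cm^-1) for the labeled protein.
    """
    dye_molar = a_dye / (CY3_EXTINCTION * path_cm)
    # Remove the dye's own contribution at 280 nm before computing protein concentration.
    protein_a280 = a_280 - CY3_A280_CORRECTION * a_dye
    if protein_a280 <= 0:
        raise ValueError("corrected A280 is not positive; check the readings")
    protein_molar = protein_a280 / (protein_extinction * path_cm)
    return dye_molar / protein_molar


# Illustrative numbers only: unremoved free dye raises the dye-peak absorbance,
# which inflates the apparent D/P even though the protein is unchanged.
purified = dye_to_protein(0.15, 0.182, 170_000.0)
with_free_dye = dye_to_protein(0.30, 0.194, 170_000.0)
```

Because free dye contributes to the dye-peak absorbance exactly as conjugated dye does, an estimate of this kind cannot distinguish the two, which is consistent with the drop in D/P observed after dialysis.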
Figure 12: Microarray Experimental Setup
The remainder of the work for the first array examines how the various samples interact with TFBS, THBS, RB2, and RB3 to answer the questions posed in Figure 12, either confirming that GQ’s can interact with dyes, or attempting to explain why the Cy3-SA does not interact with TFBS in the same manner as Cy3-IgE. First, we tested whether Cy5 dye would bind to the G-quartet forming TFBS aptamer, since prior work showed that Cy3 dye (both alone and as Cy3-IgE) could bind TFBS (Fig 11C-E). Cy5 requires scanning in the red channel, whereas the other arrays were scanned using the green channel specific for Cy3. Figure 14A clearly shows binding of the Cy5 dye, confirming that G-quartet structures can bind Cy5 as well as Cy3 dye and biotin under the given conditions. Figure 14B displays the green channel binding of the samples to TFBS (note that Cy5 displays minimal signal in the green channel, 2_2). All arrays were above the SA background (Fig 13F), but the IgE’s and Cy3-BSA show an increasing trend similar to the free dye in Figure 14A. The Cy3-streptavidin sample was dialyzed because previous results showed that Cy3-streptavidin did bind TFBS above background, but the signal did not increase with increasing linker length as it did for the free dye and Cy3-protein samples. If the dye were the source of the TFBS signal, Cy3-streptavidin should bind as well, and we speculated that the difference may have been due to sodium azide in the sample. The dialyzed Cy3-SA (commercially ordered) has elevated fluorescence (similar to previous results), but not increasing intensity across linker length. Thus removal of NaAz did not improve binding.

Cy3-BSA did bind TFBS with signal increasing across linker length, and it is similar in MW and pI to Cy3-SA and not glycosylated, effectively ruling these factors out as the cause of minimal Cy3-SA interaction. The binding profile of Cy3-Estradiol antibody (Fig 14D) is remarkably similar to dialyzed Cy3-SA. These were the only two proteins to be analyzed at 10 nM rather than 100 nM: 1) in order to rule out 10 nM Cy3-SA as the binding agent in the biotin-IgE/Cy3-SA experiments, and 2) because purification of Cy3-Estradiol antibody yielded an extremely dilute sample. The low concentration of the two samples may be the cause of the abnormal binding effects; however, it also exposed purity from free dye as the explanation (as discussed above). Cy3-SA is highly purified by the manufacturer, and Cy3-Estradiol antibody (and Cy3-IgE D/P= 2.3) was purified on a longer purification column than Cy3-BSA, Cy3-IgE D/P= 11, and Cy3-IgE D/P= 4.8. A longer separation column would be expected to produce better resolved Cy3-protein and Cy3 dye bands. Cy3-IgE D/P= 4.8 may have been purified better than Cy3-IgE D/P= 11 despite using a similar column; otherwise we would expect the free dye from the former to bind TFBS with a similar intensity if free dye were the only factor. More likely, there is some interplay between free dye affinity, protein-bound dye affinity, D/P, and quenching.

It appears that TFBS binds free dye with a higher affinity than protein-bound dye, likely because of steric crowding or accessibility issues when the protein is involved. Finally, it is unclear why TurboRFP demonstrates enhanced TFBS and THBS signal. It could be related to a real response of the protein for the GQ binder, or it may be due to sample bleed-over from the Cy3-IgE D/P= 11 array during the washing process.
Figure 13: Interaction of the Positive (left) and Negative (right) Controls with Cy3-IgE at Different D/P

Figure 15 summarizes the conclusions from the questions we aimed to answer with the Standard Control experiments. We did find that lower Cy3-IgE D/P gave fluorescence enhancement, but the signal was still significantly lower than with the biotin-IgE/Cy3-SA method. Low D/P enhanced the signal observed from the weaker binder D17-1. Cy5 dye can bind TFBS in addition to Cy3 dye, strengthening the case that dyes can be used to probe GQ structure on microarrays. Finally, we ruled out MW, pI, glycosylation, and NaAz as the cause of the decreased Cy3-SA binding profile, while identifying purity from free dye as the main contributing factor. Future work aims at further characterization of these interactions, and of interactions with other known aptamers.
Figure 14: Interaction of the Samples with TFBS and/or THBS

Figure 15: Pictorial Summary of Results

In the extended controls, the riboflavin 2- or 3-tiered GQ aptamers were included to see if the dye could bind other GQ’s, and to probe some different aspects of GQ folding/function. In Figure 16A we can see that Cy3 can bind RB3 similarly to (slightly better than) TFBS, supporting the idea that the GQ structure is needed for dye binding. Note that the RB2 aptamer does not bind, which is logical because it was reported34 that RB2 is less stable than RB3 in a Na+ based buffer, and requires a higher K+ content for structural stability. In Figure 16B, the data in Figure 16A for TFBS binding of a new dye pack were compared to the binding of a dye pack stored in water for several months. The “old” dye pack was expected to have hydrolyzed rapidly, removing the N-Hydroxysuccinimide (NHS) moiety of the dye and leaving a terminal carboxylic acid group. Interestingly, the “old” dye pack did not bind the TFBS GQ as strongly as the new pack, indicating that the NHS moiety may aid in binding. RB3 binding (not shown) was similarly diminished but remained above background. The samples were diluted to different concentrations and analyzed in a fluorescence plate reader for Figure 16C. The fluorescence curves were similar for the two samples, showing that dye photobleaching was not the cause of the lower apparent binding. We are looking into methods to evaluate the structure of the old dye pack to confirm whether the NHS moiety has been hydrolyzed. Another experiment involved mixing Cy3 and biotin together to see if biotin could compete with Cy3 for binding of the GQ’s (Fig 16D). The fluorescence intensity values were not significantly diminished by biotin, but the Cy3 may have much higher affinity for the DNA since it is both a planar and aromatic compound.
Figure 16: Cy3 Binding to Multiple GQs

Cy3-IgE, Cy3-NPYsc (scrambled neuropeptide Y), and Cy3-thrombin were also evaluated on the extended control chip. Cy3-IgE and Cy3-NPYsc exhibited enhanced TFBS binding, but also displayed elevated RB3 binding, both likely due to the free dye component (Fig 17A-B). IgE has been reported not to bind TFBS in previous studies with PBS buffers containing different components than this work, but the Ellington group has reported cross-reactivity of TFBS with IgE in a HEPES buffer14. So there may be some potential for cross-reactivity depending on the buffer composition. Cy3-thrombin was a relatively pure sample, which displayed GQ binding for both TFBS and RB3 that exceeded that of the dye alone (Fig 17C). This indicates that thrombin could specifically bind the RB3 GQ aptamer in addition to TFBS. It is difficult to say that the higher fluorescence intensity of RB3 is due to a higher affinity without the benefit of a binding curve. The added flexibility of a longer linker appears to aid the much smaller dye in binding the TFBS (or the suspected free dye in Fig 17A) more than it aids the specific thrombin interaction, which appears saturated across all lengths. The THBS binding of thrombin is also far more intense than that of dye-only binding (Fig 11F), differentiating specific thrombin interactions from Cy3 interaction (Fig 17D). The added flexibility for THBS sequences with longer linkers drastically increased signal.
Figure 17: Protein Interactions with Multiple GQs

The final two arrays on the chip were meant to probe the GQ aspect of dye binding. If GQ formation is the cause of dye binding, then using the original buffer in which the RB2 and RB3 aptamers were selected (20 mM HEPES, 300 mM KCl, 5 mM MgCl2, + 0.1% Tween-20 and 1% BSA, pH= 7.5) should induce RB2 to fold into the GQ structure and promote binding to Cy3 (Fig 18A). Binding to RB2 was restored in the RB aptamer binding buffer, supporting the hypothesis that GQ formation is the cause of dye binding. Interestingly, the RB3 aptamer displayed decreased fluorescence in the RB buffer compared to the normal chip conditions ((8.1 mM Na2HPO4, 1.1 mM KH2PO4, 2.7 mM KCl, 137 mM NaCl, pH 7.4) + 1 mM MgCl2 + 0.1% Tween-20 and 1% BSA) (see Fig 16A). When the dye was added in buffer without K+ ((8.1 mM Na2HPO4, 137 mM NaCl, pH 7.4) + 1 mM MgCl2 + 0.1% Tween-20 and 1% BSA), RB2 binding was destroyed as expected, but RB3 and TFBS performed at least as well as when K+ ions were present (Fig 18B).

GQ formation in DNA has been reported to be highly dependent on monovalent cations34. The stability of the structures is claimed to be high in the presence of K+ because the metal ion has an ionic radius between 1.3-1.5 Å, an ideal size to fit within the two GQ’s of the complex37 (see Fig 18). The metal ion is believed to shift the equilibrium of a distribution of random and GQ folded TFBS toward the GQ structure, aiding in thrombin binding38. Some reports state that the TFBS GQ is not detected without K+ in the system39, while others believe K+ is more beneficial near the melting temperature of TFBS38. Baldrich and coworkers found that increasing the KCl concentration from 10-100 mM increased binding to thrombin, while binding was reduced to the level of PBS buffer alone when the KCl concentration was increased to 5 pM. In contrast to the Baldrich work, Tang and coworkers40 observed a decrease in binding from 10-50 mM KCl for TFBS and thrombin, but they were using a HEPES-based buffer with 10 mM MgCl2. Na+ and Mg2+ have been shown to form weaker complexes with the GQ and lower the melting temperature due to their smaller ionic radii. Hianik and coworkers41 saw that increasing the Na+ concentration reduced thrombin binding in a Tris binding buffer, and another group reported that Na+ reduced the stability of GQ’s irrespective of thrombin42. Baldrich likewise reports that adding Na+ or Mg2+ to a solution with KCl will lower the binding. Baldrich concludes overall that at low temperature, K+ is not critical for thrombin binding. This body of research explains why TFBS does not appear to be affected by the riboflavin buffer, or the buffer lacking potassium.

It may also give insight into why RB3 shows lower binding in the RB selection buffer than in the normal chip conditions. Recall that two different groups reported a decrease in binding at high KCl concentrations. While the K+ may stabilize the GQ structure, it may actually make it more difficult to bind a target, either through a competition effect or by creating such a rigid structure that it cannot adapt to interact strongly with the target. Song and coworkers observed that while K+ might stabilize the structure of THBS43, it actually reduces the ability of THBS to form a complex with thrombin. So, in essence, RB3 may have been overloaded with KCl and “locked” into a rigid conformation that could not interact with the Cy3 dye as much. Results for both riboflavin/RB3 aptamer34 and thrombin/TFBS38 show that the target itself can stabilize the respective aptamer, and Baldrich also claims that the presence of thrombin itself can promote GQ formation in the TFBS. When the buffer without K+ was used for the Cy3 studies, the more flexible structure, slightly stabilized by Na+, could recognize the dye and change conformation to accommodate the target.

Tasset and coworkers contend that the THBS aptamer consists of a GQ that requires extra spatial mobility to form due to the longer duplex between the 5’ and 3’ terminus44. They also report that the substitution of T for A at position 4 (Fig 18) destroys the T4:T13 base pair present in the TFBS, reducing the stability of the GQ compared to TFBS. This explains why we see binding of Cy3 dye to THBS (Fig 11G) that increases substantially with linker length, and that is weaker than binding to TFBS. (Note the drastic difference across linker lengths for Cy3-thrombin binding THBS in Fig 17D.) Tasset also reports that THBS is not dependent on K+, and we see improved binding in the buffer lacking K+ (Fig 18C). The RB buffer produces the next highest level of fluorescence, and also has the highest K+ concentration. Potentially, while the THBS doesn’t depend on K+, the increased MgCl2 concentration, which is known to stabilize non-GQ aptamers, may aid in stabilizing the stem portion, which may indirectly stabilize the GQ; alternatively, the sequence may prefer HEPES buffer to PBS45. The GQ structure of THBS has been proposed, but no evidence had been presented to validate the structure before the microarray work. Microarrays may provide an avenue for studying structures of proposed aptamers, and characterizing their behavior under changing conditions.


Figure  18:  Cy3  Interactions  in  Changing  Buffer  Conditions34,42,44 

Finally, the arrays were analyzed to determine if new binders to any of the Cy3 proteins were apparent. In addition to the control sequences, either 7,000 (standard controls) or 5,000 (extended controls) different sequences from PT1 were present in duplicate. As described previously, we calculated the mean and the range to evaluate potential binders. It is not ideal to use the 8X15k arrays to identify new binding sequences due to the relatively low library representation, few replicates, and generally lower diversity of different sequences compared to the 1X1M arrays. However, strong binders demonstrating fluorescence intensity values high above background values with reasonably small ranges may still be identified.

Of all the proteins tested, only thrombin bound to sequences with intensities significantly above background. The top ranked sequence, 4A018 (Fig 19A), demonstrated a mean fluorescence intensity value >100X greater than background (SA-apt), and the range is <15% of the mean.

The top 5 ranked sequences (disregarding the T10 linker) were aligned and analyzed for sequence homology using MAFFT (Fig 19B1). The majority of the perfect sequence homology appears in the external stem region of the DNA, but internal homology is observed when two sequences are directly compared. All binding sequences have 5 or 6 G runs. We also entered the top 5 thrombin binders into the WebLogo program to generate the consensus sequence shown in Figure 19B2. The height of the bases corresponds to how strong the homology is at a given position. Note that the 5 sequences contain matches even in the N regions of PT1, designated with a green box. This consensus sequence was aligned with the TFBS in MAFFT (Fig 19B3), showing that the consensus sequence was a perfect match for 12 of 15 bases in TFBS. Cy3-only binding to the top ranked DNA was not considered since we know that thrombin binds GQ DNA, and we just showed dyes can bind GQ’s as well.
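The G-run tally quoted above can be reproduced with a short script. This is a minimal sketch that assumes a "G run" means two or more consecutive G bases (the report does not define the term explicitly); the sequences are the five top binders as listed in Figure 19B1, and the function name is illustrative.

```python
import re

# Top five thrombin binders, copied from the Figure 19B1 alignment.
TOP_BINDERS = {
    "4A018": "ggttggtttttcaatcagcgatcgcggaatccagggttaggcggccaacc",
    "2A353": "ggttggtcattaaattggaaattgcggagctgacaattagaaagccggcc",
    "1A523": "ggttggctcgggagctcctgacccgtgggttggaaattaaaaagccggcc",
    "1A175": "ggttggtttattggtttatggccgcgcagccatcagtcctagggccgacc",
    "4A202": "ggctggtcggttggttagtagtctcgcggccgcgggtcccataaccagcc",
}


def count_g_runs(sequence, min_len=2):
    """Count maximal stretches of at least min_len consecutive G bases."""
    pattern = "g{%d,}" % min_len  # assumption: a run is >= 2 adjacent Gs
    return len(re.findall(pattern, sequence.lower()))


g_run_counts = {name: count_g_runs(seq) for name, seq in TOP_BINDERS.items()}
for name, n in g_run_counts.items():
    print(name, n)
```

Under this definition each of the five binders has 5 or 6 runs, in agreement with the statement in the text.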

Next, we wanted to determine if the PT1 library was biased in favor of the binding sequences by a high representation of potential thrombin binders present in the 5k library. This was accomplished by counting the on-chip abundance of some of the different fragments possible within the context of the pattern and comparing them to the TFBS sequence (for simplicity, we consider the first several bases of TFBS to be “required” for binding). PT1 allows for:

PT1-  (RRYYRRYY)-N4-(RRYY)-N3-(RRYY)-N4-(RRYY)-N3-(RRYY)-N4-(RRYYRRYY); 
R=(A,G)  -  purines,  Y=(T,C)  -  pyrimidines,  N=(A,C,G,T)  -  random. 

Some relevant potential fragments of the first several bases within this context and their abundance (of 5k total):

GGTTGGTT - 84
GGTTGGT - 194
GGTTGG - 406

GGCCGGCC - 89
GGCCGGC - 159
GGCCGG - 386

GGTGTGG - 0 (because positions 4 and 8 are locked as T or C, not G)

AATTAATT - 16
AATTAAT - 59
AACCAACC - 43
AACCAAC - 78

TFBS: GGTTGGTGTGGTTGG.

The 6 base GGTTGG present in the TFBS accounts for 406 of the PT1 sequences, or 8.1% of the 5,000 total oligonucleotides. Expanding this to the first 7 bases of TFBS (GGTTGGT) means that 3.9% (194) of the library present meets the criteria for binding. By comparison, the sequences with C’s substituted for T’s (non-binding), GGCCGG and GGCCGGC, constitute 7.7% (386) and 3.2% (159), respectively. Oligonucleotides beginning with A were less abundant than G counterparts of similar length, but they would still be grouped into the nonbinding category. So the potential binders (considered to begin with GGTTGG as in the TFBS) do not dominate the library, and the fact that the top 5 ranked sequences all contain this motif (with the exception of a C in position 3 for 4A202) is a strong indicator of its role in thrombin binding.
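The library percentages quoted above follow directly from the fragment counts. The sketch below recomputes them; the counts and the 5,000-sequence total are taken from the text, while the helper names and the small prefix-counting function are illustrative and not part of the original analysis.

```python
LIBRARY_SIZE = 5000  # extended-control PT1 sequences on the chip

# On-chip counts of sequences beginning with each fragment (from the text).
FRAGMENT_COUNTS = {
    "GGTTGG": 406,
    "GGTTGGT": 194,
    "GGCCGG": 386,
    "GGCCGGC": 159,
}


def percent_of_library(count, total=LIBRARY_SIZE):
    """Express a fragment count as a percentage of the library."""
    return 100.0 * count / total


def count_prefix(sequences, prefix):
    """Count sequences that start with the given fragment (case-insensitive)."""
    prefix = prefix.upper()
    return sum(1 for s in sequences if s.upper().startswith(prefix))


percentages = {frag: round(percent_of_library(n), 1) for frag, n in FRAGMENT_COUNTS.items()}
for frag, pct in percentages.items():
    print(f"{frag}: {pct}%")
```

With a full list of on-chip sequences, `count_prefix` would regenerate the raw counts; here only the published counts are used.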

Sequences  with  the  GGTTGGTG  or  GGTGTGG  motif  will  not  be  present  because  the  PT1 
pattern  does  not  allow  for  G  at  positions  4  or  8.  SA-Apt  is  provided  as  a  range  of  ranks  based  on 
the  highest  and  lowest  fluorescence  intensity  values  for  comparison  (Fig  19A). 
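The positional constraints of PT1 can be written as a regular expression, which makes it easy to check why some TFBS-like motifs cannot occur at the start of a library member. This is a minimal sketch: the pattern is transcribed from the PT1 definition given above, and the function names are illustrative.

```python
import re

RRYY = "[AG]{2}[CT]{2}"  # R = purine (A, G), Y = pyrimidine (T, C)

# (RRYYRRYY)-N4-(RRYY)-N3-(RRYY)-N4-(RRYY)-N3-(RRYY)-N4-(RRYYRRYY), 50 bases total
PT1_REGEX = re.compile(
    "^" + RRYY * 2 + "[ACGT]{4}" + RRYY + "[ACGT]{3}" + RRYY + "[ACGT]{4}"
    + RRYY + "[ACGT]{3}" + RRYY + "[ACGT]{4}" + RRYY * 2 + "$"
)

FIRST_BLOCK = ["AG", "AG", "CT", "CT", "AG", "AG", "CT", "CT"]  # positions 1-8


def fits_pt1(sequence):
    """True if a full-length sequence satisfies the PT1 pattern."""
    return PT1_REGEX.match(sequence.upper()) is not None


def prefix_allowed(prefix):
    """True if a fragment of up to 8 bases is compatible with positions 1-8 of PT1."""
    prefix = prefix.upper()
    if len(prefix) > len(FIRST_BLOCK):
        raise ValueError("prefix longer than the first RRYYRRYY block")
    return all(base in allowed for base, allowed in zip(prefix, FIRST_BLOCK))


print(fits_pt1("ggttggtttttcaatcagcgatcgcggaatccagggttaggcggccaacc"))  # 4A018
print(prefix_allowed("GGTTGGTG"), prefix_allowed("GGTGTGG"))
```

The two rejected prefixes fail at position 8 and position 4 respectively, where the pattern requires a pyrimidine.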
Rank        ID       Mean      Range
1           4A018    12650     1700
2           1A175    11350     2500
3           1A523    9640      300
4           2A353    8905      190
5           4A202    6945      190
6           0A044    4275      790
7           2A133    2760      2340
8           3A645    2405      1670
9           3A325    2355      670
10          1A529    2210      120
11          3A589    1755      10
12          1A926    1654.5    1411
13          0A418    1610      780
14          0A189    1475      90
15          4A208    1285      150
1884-4948   SA-Apt   104       12

B)  Binder  Alignment 

B1) CLUSTAL format alignment by MAFFT (v6.901b)

Base  position:  9-12  17-19  24-27  32-34  39-42 

4A018  ggttggtttttcaatcagcgatcgcggaatccagggttaggcggccaacc 

2A353  ggttggtcattaaattggaaattgcggagctgacaattagaaagccggcc 

1A523  ggttggctcgggagctcctgacccgtgggttggaaattaaaaagccggcc 

1A175  ggttggtttattggtttatggccgcgcagccatcagtcctagggccgacc 

4A202  ggctggtcggttggttagtagtctcgcggccgcgggtcccataaccagcc 


B2) 


Consensus:  GGTTGGTTNNTNAGTTNGTGATCGCGGAGCCGNNAGTTANANAGCCGGCC 
B3)  CLUSTAL  format  alignment  by  MAFFT  (v6.935b) 


Consensus  ggttggttnntnagttngtgatcgcggagccgnnagttananagccggcc 

TFBS  ggttggt - gtggttgg - 


C) Random Sequences Alignment

C2) CLUSTAL format alignment by MAFFT (v6.901b)


4A019  aaccagttatcgagtcggcagtctgctgatctaagaccacctggctggtt 
4A203  aaccgattagcaagcttatgatctataagtctgcgattgaacaatcggtt 
2A354  ggccgacctggaggctataggccctcaagcctccagtctaaggattggcc 
1A176  ggttaatcccgtggcttttaaccgtgtggtcctgggccccgcagttaacc 
1A524  gactggttccatagctctcagctacacggttactagtttggcggccagtc 


C3) 

CLUSTAL  format  alignment  by  MAFFT  (v6.935b) 


Consensusnb  gaccgattnnnnagctntnagccnnnnggtctnnggtcnnncggttggtc - 

TFBS  - ggttggtgtggttgg 


Consensus:  GACCGATTNNNNAGCTNTNAGCCNNNNGGTCTNNGGTCNNNCGGTTGGTC 


Rank   ID      Mean    Range
4096   4A019   103     2
1917   4A203   106     4
4171   2A354   103     2
369    1A176   219.5   233
1808   1A524   106.5   1

Figure  19:  Identification  of  Potential  Thrombin  Binding  Aptamers 

The next five sequential sequences following the top five binders (e.g., 4A018 is the top ranked binder, so 4A019 is analyzed as a “random” sequence) were also evaluated (Fig 19C). These sequences demonstrated low fluorescence intensity values (Fig 19C1) as well as minimal structural homology with each other (Fig 19C2), since there is only one perfect match. In the order shown in Figure 19C1, the oligonucleotides have 3, 1, 6, 4, and 4 G pairs, respectively. Thus, higher numbers of G pairs will only promote binding if the surrounding sequence is correct and has the potential to fold into a functional GQ structure. Sequence 1A176 is ranked 369 in mean fluorescence intensity, but the range is higher than the mean because the mean is calculated from one low point (FI= 103, close to the mean of the background SA-Apt) and one relatively high point (FI= 336). Thus, the higher point is more likely a result of experimental error from having only two replicates than a real binding event. The consensus sequence evaluation of the random sequences (Fig 19C3) shows fewer base matches (smaller height of bases compared to Figure 19B2) in addition to a smaller degree of alignment in the random regions (green boxes). Alignment with the TFBS showed several bases at the 3’-end matching (Fig 19C4), although the two G runs in this homologous region are not sufficient to form the GQ structure reportedly necessary for thrombin binding. The conclusions are that top binders display a concrete sequence homology with each other and with the TFBS aptamer, and a random sampling of non-binders has minimal homology.
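The mean-and-range evaluation of duplicate spots can be illustrated with the 1A176 values quoted above. This is a minimal sketch: the two intensity values come from the text, while the rule of flagging a probe whose range exceeds its mean is only an illustrative reading of the argument, not a threshold stated in the report.

```python
def summarize_duplicates(fi_a, fi_b):
    """Return (mean, range) for two replicate fluorescence intensities."""
    mean = (fi_a + fi_b) / 2
    spread = abs(fi_a - fi_b)
    return mean, spread


def is_suspect(mean, spread):
    """Illustrative flag: duplicates disagree by more than their own mean."""
    return spread > mean


# 1A176: one replicate near background (103) and one elevated (336).
mean_1a176, range_1a176 = summarize_duplicates(103, 336)
print(mean_1a176, range_1a176, is_suspect(mean_1a176, range_1a176))

# 4A018 (mean 12650, range 1700 in Figure 19A) for comparison.
print(is_suspect(12650, 1700), 1700 / 12650)
```

For 1A176 the duplicates give a mean of 219.5 and a range of 233, so the probe is flagged, whereas the range for 4A018 is a small fraction of its mean.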
Testing of binding was performed via SPR with 4A018 and thrombin utilizing the schematic depicted in Figure 20A. Figure 20B shows a summary of the amounts of the components immobilized in each flow channel (FC), and Figures 20C-E are representative sensorgrams for neutravidin immobilization on FC4 (Fig 20C), and 4A018 immobilization on FC4 (Fig 20D) and FC3 (Fig 20E). Figure 21 shows the reference (FC1) subtracted results from adding a range of thrombin concentrations to the chip. Figures 21A-B are the reference subtracted sensorgrams for FC3-1 and FC4-1, displaying a trend of increasing response as the thrombin concentration is increased. FC2-1 did not show binding to thrombin (data not shown). The triplicate data from the sensorgrams were used to construct the affinity curves in Figures 21C-D. FC3-1 reported a Kd= 1.82 ± 0.16 pM (Fig 21C), while FC4-1 reported a Kd= 2.88 ± 0.10 pM (Fig 21D). The sensorgrams were also used to create kinetics curves (Fig 21E-F) that were fit to obtain ka and kd, and these rate constants were used to independently calculate Kd (Fig 21E inset).
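The relationship between the fitted rate constants and the equilibrium constant can be sketched as follows. The example assumes a simple 1:1 binding model, which the report does not state explicitly, and the numerical values are hypothetical placeholders rather than fitted results from Figure 21.

```python
def kd_from_rates(k_assoc, k_dissoc):
    """Equilibrium dissociation constant from kinetic rates: Kd = kd / ka.

    k_assoc is in M^-1 s^-1 and k_dissoc in s^-1, so the result is in M.
    """
    return k_dissoc / k_assoc


def steady_state_response(conc, kd_eq, r_max):
    """1:1 binding isotherm (assumed model): R = Rmax * C / (Kd + C)."""
    return r_max * conc / (kd_eq + conc)


# Hypothetical rate constants, for illustration only.
example_kd = kd_from_rates(k_assoc=1.0e4, k_dissoc=2.0e-2)

# At an analyte concentration equal to Kd, the response is half of Rmax.
half_response = steady_state_response(conc=2.0, kd_eq=2.0, r_max=100.0)
print(example_kd, half_response)
```

Agreement between the Kd obtained from the steady-state affinity curve and the Kd obtained from the ratio of rate constants is a common consistency check for SPR data.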


[Figure 20B: table of amounts immobilized in each flow channel; only partial values survive extraction: FC1, 14401.0, 1754.6, 2482.0]

Figure  20:  SPR  Strategy  and  Immobilization  of  4A018 

Finally, the 4A018 response was also tested in triplicate against three other proteins as a control experiment. Bovine serum albumin (BSA) and human serum albumin (HSA), proteins that are abundant in the body and similar in molecular weight to thrombin, as well as neuropeptide Y (NPY), a protein with a pI close to that of thrombin, were chosen as controls. Figures 22A-C show the sensorgrams demonstrating minimal response for BSA, HSA, and NPY on FC4-1. The summary of protein response for FC3-1 and FC4-1 is depicted in Figure 22D-E. This shows the specificity of the 4A018 interaction for thrombin over other physiological proteins of similar size or pI. Ongoing work is being carried out to determine the structural properties of 4A018 as well as to characterize the affinity of some of the other array-discovered probes for thrombin.
Figure 21: Thrombin Interaction with 4A018

Figure 22: 4A018 SPR Response to Control Proteins and Overall Binding Summary

5.0 CONCLUSIONS

A summary of significant findings includes:

1. Increasing linker length increases fluorescence intensity by extending the
probes from the surface, and some minimal length is needed to obtain reliable
data.

2.  The  linker  identity  (A,  C,  G,  or  T)  affects  fluorescence  intensity,  with  T  base 
linkers  proving  reliable  across  all  linker  lengths  and  target  concentrations. 

3. A direct (Cy3-protein) and an indirect (biotin-protein followed by Cy3-streptavidin)
target labeling method were compared, and it was shown that the indirect method
both enhances S/N by orders of magnitude and decreases nonspecific binding to
controls.

4.  Trends  of  probe  behavior  were  recognized,  with  the  in  vitro  library  designed  to 
have  more  G-rich  stretches  appearing  to  display  enhanced  binding  to  the  biotin 
and  Cy3  tags. 

5.  Known  G-quartet  aptamers  to  non-protein  targets  bound  the  labeled  protein  with 
variable  intensities  depending  on  the  target  composition,  purity,  concentration, 
and  stability  of  the  DNA  under  experimental  conditions.  The  small  molecule  tags 
were  validated  as  the  source  of  binding  because  neither  the  protein  target  alone 
nor  the  Cy3-SA  interacted  with  the  controls. 

6.  Potential  aptamers  to  biotin  and  Cy3  were  unearthed  by  using  a  mean 
fluorescence  intensity  ranking  system  to  cross-reference  the  performance  of 
probes  with  their  presentation  in  various  experiments  including  comparing  Cy3- 
protein,  Cy3  dye  alone,  and  biotin-protein  binding. 

7.  The  planar  and/or  aromatic  reporter  tags  interacted  extensively  with  the  DNA. 

8.  The  surface  density  of  the  arrays  is  too  low  to  expect  binding  from  a  random 
library,  so  starting  library  optimization  must  be  performed.  A  thrombin  binder 
was  identified  and  characterized  from  the  patterned  library. 

9.  Microarray  experiments  generate  massive  amounts  of  data  that  can  be 
problematic  to  organize  and  analyze. 

10.  Purification  of  free  dye  from  dye-protein  conjugates  is  essential,  and  the 
manufacturer’s  method  is  insufficient  for  aptamer  identification  experiments. 

The DNA microarray work in this paper was designed to support three different initiatives: 1)
to show that microarrays can be utilized to identify protein aptamer candidates; 2) to show that
non-canonical G-quartet DNA can be detected by Cy3 (or biotin) binding on the microarray; and 3)
to show that aptamers to the small molecule biotin and Cy3 tags can be discovered by protein
conjugation. Towards the first goal, we observed that increasing linker length increases
fluorescence intensity by extending the probes from the surface, and some minimal length is
needed to obtain reliable data. The linker identity also affects fluorescence intensity, with
T base linkers proving reliable across all linker lengths and target concentrations. We compared
direct and indirect target labeling methods, and found that the indirect method both enhances
S/N by orders of magnitude and decreases nonspecific binding to controls. Trends of probe
behavior were recognized, with the in vitro library designed to have more G-rich stretches
appearing to display enhanced binding to the biotin and Cy3 tags. Several probes were identified
with potential for IgE binding with structures different from those previously reported in the
literature. Most likely, strong candidates were not observed because the patterns used in the
libraries do not allow for “good” IgE binders. IgE aptamers may require a fairly rigid base
sequence that was not obtainable. If strong binders were present, they would likely
preferentially bind IgE over the free dye, and present with enhanced signals similar to thrombin
vs. Cy3 binding.

We  also  studied  the  behavior  of  the  TFBS,  THBS,  and  RB  aptamers  to  examine  G-quartet 
properties  on  the  arrays.  G-quartets  bound  to  targets  with  variable  intensities  depending  on  the 
target  composition,  concentration,  and  stability  of  the  DNA  under  experimental  conditions. 

The  small  molecule  tags  (dye  or  biotin)  were  validated  as  the  source  of  binding  because  neither 
the  IgE  alone  nor  the  Cy3-SA  interacted  with  the  controls.  We  were  able  to  use  the  tags  to 
study  known  GQ  aptamers  under  changing  conditions,  and  elucidated  interesting  structural 
information on the probes. Literature reports on THBS structure and behavior were validated
using microarrays. Therefore, DNA microarrays can be used to assay new and existing
aptamers to screen for G-quartet formation by simply adding dye. Interesting studies of
aptamer stability under various conditions (buffer, temperature, etc.) could also be performed
in a multiplexed format using microarrays. Potentially, microarrays could be used to quickly
screen for function those probes that will be immobilized on surfaces such as nanoparticles.

Potential aptamers to biotin and Cy3 were unearthed by using a ranking system to cross-
reference the performance of probes across the various experiments. At the least, these
aptamers will enrich the field of small molecule aptamers in a high-throughput platform. The
aptamers could be utilized as positive controls in future Cy3/biotin microarray work. This
result shows the feasibility of conjugating small molecule targets to a larger reporting system
in which binding occurs with the small molecule rather than the reporter. This could be used to
develop small molecule aptamer discovery protocols, or a dye displacement strategy could be
employed from an in vivo-constructed library of DNA molecules with similar structures.
Eventually  we  plan  to  use  microarrays  as  a  screening  tool  for  patterned  libraries.  We  will 
screen  for  libraries  with  a  low  selection  rate  and  high  structural  diversity  in  silico,  then 
evaluate  them  on  the  arrays  for  small  molecule  binding.  “Winning”  patterns  will  be  applied  to 
variations  of  SELEX  to  increase  the  selection  efficiency  and  decrease  required  labor. 
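
The ranking-based cross-referencing described in the preceding paragraph can be illustrated with a minimal sketch. This is not the code used in this work; the function names, the `top_n` cutoff, and the probe intensities are hypothetical and chosen only to show the idea of ranking probes by mean fluorescence intensity within each experiment and then comparing where each probe ranks highly.

```python
from collections import defaultdict


def rank_probes(mean_intensity):
    """Return {probe_id: rank}, where rank 1 is the highest mean fluorescence intensity."""
    ordered = sorted(mean_intensity, key=mean_intensity.get, reverse=True)
    return {probe: i + 1 for i, probe in enumerate(ordered)}


def cross_reference(experiments, top_n):
    """Map each probe to the set of experiments in which it ranks within top_n."""
    hits = defaultdict(set)
    for name, intensities in experiments.items():
        for probe, rank in rank_probes(intensities).items():
            if rank <= top_n:
                hits[probe].add(name)
    return dict(hits)


# Hypothetical mean intensities (arbitrary units), for illustration only.
experiments = {
    "Cy3-protein": {"P1": 900.0, "P2": 850.0, "P3": 40.0, "P4": 300.0},
    "Cy3 dye alone": {"P1": 700.0, "P2": 30.0, "P3": 25.0, "P4": 650.0},
    "biotin-protein": {"P1": 20.0, "P2": 800.0, "P3": 600.0, "P4": 35.0},
}

hits = cross_reference(experiments, top_n=2)
for probe in sorted(hits):
    print(probe, sorted(hits[probe]))
```

Under these assumed values, a probe that ranks highly with both Cy3-protein and Cy3 dye alone, but not with biotin-protein, would be a candidate for closer inspection as a possible Cy3 binder. The cutoff and the interpretation rule are assumptions, not criteria taken from the experiments.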

Finally,  we  were  able  to  identify  and  confirm  thrombin  aptamers  using  only  5k  sequences  from 
a  partially  structured  library.  The  library  was  not  specifically  designed  to  contain  thrombin- 
binders,  but  motifs  similar  to  TFBS  demonstrated  the  highest  signal.  Applying  thrombin  to  a 
1X1M  array  would  likely  provide  even  higher  affinity  aptamers.  Thrombin  was  a  “better” 
target  candidate  than  IgE  due  to  the  potential  to  bind  a  range  of  similar  structures  that  form 
GQs. 

There  are  several  major  drawbacks  associated  with  using  DNA  microarrays  for  aptamer 
selection.  Most  notably,  the  planar  and/or  aromatic  tags  interacted  extensively  with  the  DNA. 

It is extremely difficult to determine whether binding below the positive-control level is real
target binding or nonspecific tag interaction. Removing free dye from targets after
conjugation should be a priority, and the manufacturer’s recommendation of a desalting
column is insufficient when new binders are being pursued. The indirect labeling method was
able to lower suspected nonspecific binding, but an ideal strategy would not require a label at
all. Secondly, the surface density of the arrays when compared to SELEX is too low to expect
binding from a random library. If only 1 in 1 × 10⁹ sequences is a binder in a random SELEX
library⁶, extraordinary fortune would be required for even 1 sequence to show any binding on an
array. Design of the starting library has shown promise, but the methods from the literature
largely take into account binding behavior from previously existing aptamers for the target.
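
The scale of the "extraordinary fortune" mentioned above can be made concrete with a short calculation. The sketch below assumes an array of one million unique sequences (the upper limit quoted for commercial arrays in the Summary), a binder fraction of 1 in 1 × 10⁹ as cited in the text, and independent random sequences. The function names are illustrative only.

```python
import math


def expected_binders(n_probes, binder_fraction):
    """Expected number of binders among n_probes independent random sequences."""
    return n_probes * binder_fraction


def prob_at_least_one(n_probes, binder_fraction):
    """P(at least one binder) = 1 - (1 - f)^N, evaluated stably for very small f."""
    return -math.expm1(n_probes * math.log1p(-binder_fraction))


n_probes = 1_000_000      # assumed: one million unique sequences on the array
binder_fraction = 1e-9    # 1 binder per 1 x 10^9 random sequences, as cited in the text

print(f"expected binders on the array: {expected_binders(n_probes, binder_fraction):.1e}")
print(f"chance of at least one binder: {prob_at_least_one(n_probes, binder_fraction):.2%}")
```

With these assumptions the expected number of binders on a random array is about 0.001, corresponding to roughly a 0.1% chance that any binder is present at all, which is why optimization of the starting library is needed.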

The  most  interesting  targets  will  be  those  which  do  not  currently  have  an  aptamer  available. 
Also,  gene  expression  is  the  marketed  application  for  the  microarrays,  and  evolving  a  protocol 
to suit aptamer studies can require significant effort. For example, the Agilent arrays were
designed so that a small air space remains in each array chamber, and rotation of that air
space enhances the mass transfer of the target to the surface. However, this action requires an
additive that changes the surface tension of the binding buffer, which would also affect the
binding of the DNA. Instead, we overfilled the arrays to take up the space otherwise occupied
by the air bubble, which causes a reliance purely on target diffusion. Array manufacturing takes
an average of 4-5 weeks before the chips are in hand. This tends to impede progress when
relying on the results of initial experiments to change or improve the next generation of chips.
Another challenge many may not consider is how to organize and analyze the massive amounts
of data generated by one 1X1M experiment. A simple Excel spreadsheet is insufficient to
work with the data, especially when different numbers of replicates are evaluated and
parameters such as mean fluorescence intensity and standard deviation of the replicates must
be calculated. Therefore, code to carry out these functions had to be written and tested before
we could begin to assess the results. Basic data analysis required 5-10X more time than one
physical experiment alone. More complex functions such as
cross  referencing  between  experiments  and  analysis  of  multiple  trends  also  required  separate 
programming,  although  software  such  as  GeneSpring  does  exist  that  can  perform  a  number  of 
the  necessary  processes.  Finally,  current  technology  is  only  capable  of  manufacturing  DNA 
microarrays,  but  the  company  MYcroarray  is  willing  to  collaborate  with  us  to  develop  RNA 
arrays.  Initial  work  with  DNA  microarrays  would  serve  as  proof-of-concept  studies  to  proceed 
with  RNA  microarray  development  compatible  with  riboswitch  biosensor  studies. 
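
As an illustration of the basic replicate statistics discussed above (mean fluorescence intensity and standard deviation per probe when probes have different numbers of replicates), a minimal sketch follows. It is not the analysis code written for this work; the input layout of (probe ID, intensity) pairs, the function name, and the values are assumptions made for the example.

```python
import statistics
from collections import defaultdict


def summarize_replicates(readings):
    """Group (probe_id, intensity) pairs and return {probe_id: (n, mean, sd)}.

    The sample standard deviation is reported as NaN when a probe has only
    one replicate, since it is undefined in that case.
    """
    by_probe = defaultdict(list)
    for probe, value in readings:
        by_probe[probe].append(value)

    summary = {}
    for probe, values in by_probe.items():
        mean = statistics.fmean(values)
        sd = statistics.stdev(values) if len(values) > 1 else float("nan")
        summary[probe] = (len(values), mean, sd)
    return summary


# Hypothetical spot intensities with unequal replicate counts.
readings = [
    ("P1", 100.0), ("P1", 110.0), ("P1", 90.0),
    ("P2", 50.0), ("P2", 70.0),
    ("P3", 20.0),
]

summary = summarize_replicates(readings)
for probe, (n, mean, sd) in sorted(summary.items()):
    print(f"{probe}: n={n} mean={mean:.1f} sd={sd:.1f}")
```

Grouping by probe ID before computing statistics avoids any assumption that every probe has the same number of replicates, which is the situation that makes a fixed spreadsheet layout awkward.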

Overall,  we  have  shown  that  DNA  microarrays  have  great  potential  in  rapid  aptamer 
identification  due  to  elimination  of  the  need  for  SELEX  cycling,  PCR,  cloning  or  sequencing, 
reducing  the  theoretical  timeline  to  one  day  of  experimental  work.  A  novel  thrombin  aptamer 
has  been  identified,  and  the  binding  affinity  of  the  probe  to  thrombin  has  been  quantified. 
Additionally, an aptamer initially selected on a surface provides knowledge that the sequence
will function when immobilized on a biosensor platform. Microarrays are capable of highly
multiplexed  studies,  limited  only  by  the  initial  design  of  the  array.  For  example,  the  IgE 
binders,  biotin/Cy3  binding  sequences,  and  G-quartet  results  were  all  gleaned  from  different 
probes  and  trends  from  the  same  physical  experiments.  Once  some  of  the  challenges  of  the 
technology  are  addressed,  microarrays  are  a  viable  option  for  accelerated  aptamer 
identification  and  structural  study  provided  the  target  and  initial  library  are  a  good  match. 
APPENDIX A - DNA Microarrays for Aptamer Identification and Structural Characterization

[IgE]

Figure A1: Images from Experiment in Figure 6B

Cutouts in Figure A1 are all blowups of the lower right corner for each array. Note that similar
locations fluoresce for the different conditions, corresponding to the positive controls.

Table A1: Control Values from 10 nM Cy3-SA Only

Table A2:

4AQ18

Figure A2: Structures of Potential Thrombin Aptamers
LIST OF ACRONYMS/GLOSSARY

A, C, G, T, R, Y    adenine, cytosine, guanine, thymine, R = (A/G), Y = (T/C)
ATP                 adenosine triphosphate
BSA                 bovine serum albumin
Cy3                 Cyanine dye3
Cy5                 Cyanine dye5
Cy3-BSA             Cyanine dye3 conjugated to bovine serum albumin
Cy3-Estradiol       Cyanine dye3 conjugated to estradiol antibody
Cy3-IgE             Cyanine dye3 conjugated to Immunoglobulin E
Cy3-NPYsc           Cyanine dye3 conjugated to scrambled neuropeptide Y sequence
Cy3-SA              Cyanine dye3 conjugated to streptavidin
Cy3-thrombin        Cyanine dye3 conjugated to thrombin
DNA                 deoxyribonucleic acid
D/P                 dye-to-protein ratio
FC                  flow channel
FRET                fluorescence resonance energy transfer
G-quartet (GQ)      guanine quartet
HABA                4'-hydroxyazobenzene-2-carboxylic acid
HPLC                high performance liquid chromatography
HSA                 human serum albumin
IgE                 Immunoglobulin E
Kd                  dissociation constant
LOD                 limit of detection
nt                  nucleotide
NMR                 nuclear magnetic resonance
NPY                 neuropeptide Y
PCR                 polymerase chain reaction
PT1,2,3             DNA patterned library #1, 2, or 3
RB2                 2-tiered G-quartet riboflavin aptamer
RB3                 3-tiered G-quartet riboflavin aptamer
RNA                 ribonucleic acid
S/N                 signal-to-noise ratio
SA                  mutated non-binding streptavidin aptamer (background)
SELEX               Systematic Evolution of Ligands by Exponential Enrichment
SPR                 surface plasmon resonance
ssDNA               single-stranded DNA
UV/Vis              ultraviolet/visible light absorption spectroscopy
TFBS                thrombin fibrinogen binding site aptamer
THBS                thrombin heparin binding site aptamer